Abstract

The theoretical study of social learning typically assumes that each agent’s 
action affects only her own payoff. In this paper, I present a model in which agents’ 
actions directly affect the payoffs of other agents. On a discrete time line, there 
is a community containing a random number of agents in each period. Before 
each agent needs to take an action, the community receives a private signal about 
the underlying state of the world and may observe some past actions in previous 
communities. An agent’s payoff is higher if her action matches the state or if 
more agents take the same action as hers. I analyze two observation structures: 
exogenous observation and costly strategic observation. In both cases, coordination 
motives enhance social learning in the sense that agents take the correct action with 
significantly higher probability when the community size is greater than a threshold. 
In particular, this probability reaches one (asymptotic learning) with unbounded 
private beliefs and can be arbitrarily close to one with bounded private beliefs. I 
then discuss the issue of multiple equilibria and use risk dominance as a criterion 
for equilibrium selection. I find that in the selected equilibria, the community size 
has no effect on learning under exogenous observation, facilitates learning under 
endogenous observation and unbounded private beliefs, and either helps or hinders 
learning under endogenous observation and bounded private beliefs. 

Keywords: Information aggregation, Social learning, Coordination, Herding,
Information cascade, Information acquisition 

JEL Classification: A14, C72, D62, D83, D85 

*Department of Economics, UCLA. Email: darcy07@ucla.edu. 





1 Introduction 


The study of social learning focuses on how valuable information is transmitted in a 
society of self-interested and strategic agents as well as how dispersed and decentralized 
information is aggregated to facilitate greater precision of knowledge. A typical situation 
involves a large number of individuals who make a single decision sequentially. The 
payoff of this decision depends on an unknown state of the world, about which each 
individual is given a noisy signal. The state of the world may refer to different economic 
variables in different applications, such as the quality of a new product, the return on 
an investment opportunity, or the intrinsic value of a research project. The probabilistic 
distribution of signals depends on the state and is assumed to be distinctive for each 
possible value of this state. Hence, if signals were observable, the aggregation of signals 
would be sufficient for individuals to ultimately learn the value of the state with near 
certainty. However, because signals are private and often cannot be transmitted via 
direct communication, an individual must extract information from observations of her 
predecessors’ decisions to determine her own actions. A general and important question 
thus arises: what behaviors and observation structures can lead to the level of learning 
achieved by efficient information aggregation? In other words, under what conditions will 
observation reveal the true state, and how likely is it that agents will make the correct 
decision? 

The above framework has been adopted widely in the literature, including but not
limited to the notable studies of herding behavior and information cascades in various
applications, such as investments[32], bank runs[17] and technology adoption[13]. Among
the literature that provides a theoretical analysis, renowned early research by
Bikhchandani, Hirshleifer and Welch[7], Banerjee[5] and Smith and Sorensen[33] demonstrates that
efficient information aggregation may fail: in a perfect Bayesian equilibrium, individuals
eventually herd in choosing the wrong action with positive probability. Recent works
such as Acemoglu et al.[1] consider a more general observation structure and note that
society’s learning of the true state depends on two factors: the possibility of arbitrarily
strong private signals and the nonexistence of excessively influential individuals.

However, despite the large body of theoretical literature on social learning and
information externalities, most models fail to account for a crucial factor that influences
individual strategic behaviors: coordination motives. In an environment with coordination
motives, one agent’s action may directly affect another agent’s payoff. The absence of
this effect greatly limits the range of applications that can be analyzed using the standard
framework because coordination motives are prevalent in many strategic environments
that involve social learning, ranging from the choice of computer software to the choice
of research areas. In addition, the very existence of coordination motives often facilitates
local information sharing in both signaling and observations because individuals
in such situations have mutual interests in such sharing. Hence, one should expect to 
see very different patterns of action as well as information updating in the observational 
learning process. Finally, most existing studies assume that observation is given by some 
exogenous stochastic process, but in many applications, it is part of an agent’s strategic 
decision. Presumably, once observation becomes a choice, it should have an immediate 
effect on the accuracy of action and should also change how coordination motives influence
social learning. Therefore, a more general framework is needed to include these
important elements in the study of social learning and to fully understand their impact. 

To focus on the typical strategic environment with the above features, consider the 
following example. There is a group of consumers who need to decide which one of two
possible smartphones to switch to. The sequence of actions is determined
by the expiration dates of their current contracts. Among this group, there are smaller
“communities” of consumers (for example, college friends who enroll in the same wireless 
phone package) that make their decisions within a relatively small period of time. For a 
consumer in an arbitrary community, because interaction is more convenient among people
using the same model, she prefers that others use the model that she chooses. Before
she makes her own decision, she may observe some previously made decisions from other 
communities. Such observations may be exogenously given: they may simply come from 
noticing which smartphones other people in her social network are using. Alternatively, 
observations may be strategically chosen: the consumer can pay a registration fee to enter
an online forum where she can see other consumers’ choices with corresponding time
stamps. If she is not able to view all the available posts, her rational choice is to select 
the most informative posts. Finally, regardless of the observation structure, she will most 
likely share her observations with others in her community but not with outsiders. 

In this paper, I propose a model that is consistent with the framework of
Bikhchandani, Hirshleifer and Welch[7] and Acemoglu et al.[1] and is simultaneously flexible enough to
analyze coordination motives under different observation structures. More formally, there
is an underlying state of the world that is binary in value and cannot be observed directly.
On an infinite and discrete time line, there is a community of random size in each period,
and its members (the agents) each take a binary action simultaneously. The payoff of 
an agent depends on whether her action matches the state as well as what actions are 
taken by others in her community. The more agents take the same action as she does, the 
higher the payoff she enjoys. At the beginning of the period, the agents in the community 
obtain a noisy signal about the true state.^1 The value of this signal is common knowledge
within the community but cannot be observed by any other community. 

After obtaining the signal but before taking the action, the agents simultaneously 
observe a subset of actions of their predecessors, i.e., agents from previous communities. 
The observed actions are locally shared information as well: in other words, actions observed
by one agent are also observed by every other agent within the same community.
Observation is exogenous if it is pre-determined, regardless of the signal and the community
size. Observation is endogenous if each agent can choose to pay a fixed cost and
select a given number of ordered actions to observe.

Hence, there are three central determinants of the pattern of social learning: signal 
strength (bounded or unbounded beliefs), observation structure (exogenous or endogenous)
and strength of coordination motives (community size). This paper establishes
the first theoretical framework to understand the interaction among these factors and to
answer the question of when observation is truth-revealing and whether asymptotic learning
occurs in this more realistic but complex environment. In particular, I highlight the
contrast between the pattern of learning in a singleton community and that in a potentially
large community. As can be expected, coordination motives in a large community
bring about an additional “desire to conform” that is absent when each agent cares only 
about her own action. However, as suggested by my major findings summarized below,
the incentive to decide based on the group is not entirely negative. On the contrary, 
coordination motives may improve learning in each combination of signal structure and 
observation structure. 

^1 The assumption of one signal for each community is used without loss of generality in the case of
local sharing. Equivalently, we could assume that each agent has one signal that she shares only with
others in her community.

First, suppose that observation is exogenous. When beliefs are unbounded, meaning
that a private signal may be arbitrarily informative about the true state, agents can 
almost surely know the state from their observed action sequence if they always observe 
an action that has been taken recently. Moreover, because observation is independent
of the signal value, the stronger notion of asymptotic learning can be achieved: not only do
agents know the true state with near certainty, but their actions converge to the “correct” 
action as well. This result holds regardless of the community size and is consistent with 
existing results in the theoretical literature. 

When beliefs are bounded, coordination motives facilitate better learning. Previous 
research has shown that when there is only one agent in each period (i.e., when the 
community size is one), observation never reveals the truth over time. However, I show
that when the community size becomes sufficiently large, there exists an equilibrium with 
truth-telling observation. In such an equilibrium, an agent may take either of the two 
actions for any possible posterior belief that she has on the true state: given a certain 
range of a signal, she takes her action according to the signal and otherwise acts according 
to the observation. A rough intuition for this result is that when the community size is 
sufficiently large, if all but one agent in a community choose one action, it would be 
optimal for the remaining agent to choose the same action even if it is unlikely to be the 
“correct” action. Hence, even under bounded beliefs, it is possible for all agents to base 
their actions on their signal for a non-zero measure of signals, regardless of when they each 
move in the action sequence. The signal effectively serves as a correlation device, which 
ensures efficient information aggregation from observing the actions of predecessors. As a 
further result, depending on the construction of equilibrium strategies, the probability of 
taking the correct action can become arbitrarily close to 1 at the limit; hence, asymptotic 
learning can be approximated under strong coordination motives even if the most precise 
signal has only limited information value. Indeed, as long as the probability of agents 
acting only according to the signal is positive, observation reveals the truth at the limit. 
Hence, decreasing this probability in turn increases the probability of taking the correct 
action. 

Now suppose that observation is endogenous. An initial observation is that even 
under unbounded private beliefs, asymptotic learning is not achievable: the probability 
of taking the correct action is always bounded away from 1. The reason is that with costly 
observation, an agent is not willing to observe whenever her signal is sufficiently precise
but still not perfect. I then give a necessary and sufficient condition for truth-telling
observation: the size of an agent’s observed neighborhood becomes arbitrarily large over
time. Because of the impossibility of asymptotic learning, any observation of finitely
many actions has an erroneous implication about the true state with positive probability; this
probability of error can be eliminated once infinitely many actions are observed.

In the presence of coordination motives, the equilibrium learning patterns change significantly.
When the community size is potentially large, an equilibrium emerges with
asymptotic learning (with unbounded beliefs) or approximate asymptotic learning (with 
bounded beliefs). Such improvement in learning is driven by the possibility of incentivizing
observation in a large community. For example, imagine a community in which all but
one agent choose not to observe any action. For the remaining agent, her observation is 
very valuable to both her peers and herself because when agents care about the actions of 
other agents, even a small improvement in learning about the true state brings a considerable
increase in all of their payoffs. Moreover, additional incentives for observation can
be provided by the credible threat of conforming to a sub-optimal action in the absence of 
observation. Following this intuition, I show that there exists an equilibrium in which at 
least one agent always chooses to observe for any value of the private signal. Therefore, 
the argument under exogenous observation can be applied to establish or approximate 
asymptotic learning. This implies that the negative incentive for observation, as induced 
by observation cost, can be eliminated by the marginal benefit of observation under coordination
motives. At the same time, efficient information aggregation still exists as
a result of either unbounded beliefs or a small but positive probability of coordinated 
actions based on signal only. 

One prominent difference arising from the inclusion of coordination motives in the 
model is that multiple equilibria arise in general, in contrast to the generically unique 
equilibrium with singleton communities. In the discussion section, I address the issue of 
equilibrium selection by imposing the criterion of risk dominance. I show that the equilibrium
in which each agent always maximizes the probability of action matching the state
is risk dominant, and I reveal that in this equilibrium, stronger coordination motives still 
lead to better learning. Under bounded beliefs, however, the risk-dominant equilibrium 
has different implications: depending on the observation structure, the equilibrium learning
probability with coordination motives may be higher, lower or unchanged relative to
that with singleton agents. 

The remainder of this paper is organized as follows. Section 2 provides a review of the 
related literature. Section 3 introduces the model. Sections 4 and 5 respectively present 
the main results under exogenous observation and endogenous observation. Section 6 
discusses some additional features and extensions of the model. Section 7 concludes the 
paper. All proofs are included in the Appendix. 

2 Literature Review 

A large and growing body of literature examines the problem of social learning by 
Bayesian agents who can observe others’ choices. This literature begins with
Bikhchandani, Hirshleifer and Welch[7] and Banerjee[5], who first formalize the problem systematically
and concisely and identify information cascades as the cause of herding behavior.
In their models, the informativeness of the observed action history outweighs that of 
any private signal with a positive probability, and herding occurs as a result. Smith and 
Sorensen [33] propose a comprehensive model of a similar environment with a more general 
signal structure. They show that signal strength plays a decisive role in social learning 
in the sense that the possibility of arbitrarily strong signals is necessary and sufficient for 
asymptotic learning in their framework. The concepts of bounded and unbounded private 
beliefs introduced by those authors will play an important role in the remainder of the 
current paper. These seminal papers, along with the general discussion by Bikhchandani, 
Hirshleifer and Welch[8], assume that agents can observe the entire previous decision history,
i.e., the whole ordered set of choices of their predecessors. This assumption can be
regarded as an extreme case of exogenous observation structure. Related contributions to 
the literature include Lee[29], Banerjee[6] and Celen and Kariv[12], where agents observe 
only a given fraction of the entire decision history. 

A more recent paper by Acemoglu et al.[1] studies the environment in which each
agent receives a private signal about the underlying state of the world and observes (some
of) her predecessors’ actions according to a general stochastic process of observation.
Their main result states that when the private signal structure features unbounded beliefs,
asymptotic learning occurs in each equilibrium if and only if the observation structure
enables agents to always observe some close predecessor. Other recent research in this
area includes the works of Banerjee and Fudenberg[4], Gale and Kariv[22], Callander
and Horner[10] and Smith and Sorensen[34], which differ from Acemoglu et al.[1] mainly
in making alternative assumptions regarding observation, e.g., agents observe only the
number of other agents taking each available action but not the positions of the observed
agents in the decision sequence.

Two common assumptions made in the abovementioned literature are exogenous observation
and pure informational externalities; according to the latter, an agent cares
only about taking the correct action, and her payoff is not directly affected by others’
actions. The literature exploring the relaxation of either of these assumptions is relatively
under-developed. A few recent papers initiated the discussion on the impact of costly
observations on social learning. In Kultti and Miettinen[27][28], both the underlying 
state and the private signal are binary, and an agent pays a cost for each action that she 
observes. In Celen[11], the signal structure is similar to the general one adopted in this
paper, but it is assumed that an agent can pay a cost to observe the entire history of 
actions before hers. A much richer model is given by Song[35], as it allows for the most 
general signal structure as well as the possibility that agents would need to strategically 
choose a proper subset of their predecessors’ actions to observe. A major implication 
from these works is that the existence of observation costs prevents asymptotic learning,
although it may increase the informativeness of an observed action sequence because
agents will sometimes rationally choose not to observe and rely on their signal. 

The theoretical literature on the interplay between information cascades and coordination
motives is also rather small. Moreover, the few existing papers often differ from one
another in important aspects, such as the payoff function, the sequence of moves and the 
information update process (see, e.g., Choi[13], Dasgupta[15], Jeitschko and Taylor [26], 
Frisell[21], Vergari[36]). However, there is also a small group of experimental studies of 
information cascades and payoff externalities (see, e.g., Hung and Plott[24], Drehmann et
al.[18]). The major results of those studies suggest a learning pattern that is consistent 
with this paper: when agents care about the actions of one another beyond the informational
externalities, they are both more likely to conform and more likely to take the
“correct” action. Informational herding is thus reduced. 

This paper can be positioned in line with the works of Bikhchandani, Hirshleifer and
Welch[7], Smith and Sorensen[33], Acemoglu et al.[1] and others in the sense that I adopt
the general signal structure and the sequential decision process developed in these models. 
Nevertheless, this paper differs from the previous research in two important aspects. First, 
instead of assuming an exogenous observation structure, I allow observation to occur as 
part of an agent’s strategic decision. Second, in addition to informational externalities, my 
model also features payoff externalities: the more agents take the same action, the higher 
the payoff each agent enjoys. As shown subsequently in this paper, these assumptions 
not only are more realistic in most applications but also have significant impacts on the 
equilibrium learning pattern. 

In this paper and in most of the cited theoretical papers above, agents are assumed 
to update their beliefs according to Bayes’ rule. There is also a well-known body of 
literature on non-Bayesian observational learning. In these models, rather than applying 
Bayes’ update to obtain the posterior belief regarding the underlying state of the world 
by using all the available information, agents may adopt some intuitive rule of thumb to 
guide their choices (Ellison and Fudenberg[19][20]), may update their beliefs according 
to only part of their information (Bala and Goyal[2][3]), may naively update their beliefs 
by taking weighted averages of their neighbors’ beliefs (Golub and Jackson[23]), or may 
be subject to bias in interpreting information (DeMarzo, Vayanos and Zwiebel[16]). 

Finally, the importance of observational learning has been well documented in both 
empirical and experimental studies, in addition to those already mentioned. Focusing
on the adoption of new agricultural technology, Conley and Udry[14] and Munshi[31] not
only support the importance of observational learning but also indicate that observation
is often constrained because, in practice, a farmer may be unable to receive information
regarding the choice of every other farmer in the area. Munshi[30] and Ioannides
and Loury[25] demonstrate that social networks play an important role in individuals’
acquisition of employment information. Cai, Chen and Fang[9] conduct a natural field
experiment to indicate the empirical significance of observational learning in which consumers
obtain information about product quality from the purchasing decisions of others.

3 Model


3.1 Private Signal Structure 

Consider a discrete and infinite time line: $t = 1, 2, \ldots$. At each period $t$, there is a set $N^t$ of
agents that move simultaneously. We refer to $N^t$ as a community. The community
size $Q^t = |N^t|$ is randomly selected from a commonly known probability distribution $G$
on $\mathbb{N}^+$, with the largest community size in the support being finite. The $Q^t$ values are
independent and identically distributed (i.i.d.) over time. $Q^t$ is common knowledge for
agents in $N^{t'}$, $t' \geq t$, at the beginning of period $t'$.

Let $\theta \in \{0,1\}$ be the state of the world with equal prior probabilities, i.e.,
$\mathrm{Prob}(\theta = 0) = \mathrm{Prob}(\theta = 1) = \frac{1}{2}$. Given $\theta$, an i.i.d. private signal $s^t(Q^t) \in S = (-1,1)$ is realized
in period $t$ after the realization of $Q^t$; the signal is observed by every agent in $N^t$ and by
no one else. The $s^t(Q^t)$ values are independently distributed, but their distributions can
be heterogeneous depending on $Q^t$. One interpretation of this setting is that each agent
receives and shares a signal with the community; the more agents there are, the more 
precise the aggregated information about the true state is. 

The probability distributions regarding the signal conditional on the state are denoted
as $F_0^Q(s)$ and $F_1^Q(s)$ (with continuous density functions $f_0^Q(s)$ and $f_1^Q(s)$), where $Q$ denotes
the community size. The pair of measures $(F_0^Q, F_1^Q)$ is referred to as the
signal structure, and I assume that the signal structure has the following properties for
every $Q$:


1. The pdf’s $f_0^Q(s)$ and $f_1^Q(s)$ are continuous and non-zero everywhere on the
support, which immediately implies that no signal is fully revealing of the underlying 
state. 


2. Monotone likelihood ratio property (MLRP): $\frac{f_1^Q(s)}{f_0^Q(s)}$ is strictly increasing in $s$. This
assumption is made without loss of generality: as long as no two signals generate 
the same likelihood ratio, the signals can always be re-aligned to form a structure 
that satisfies the MLRP. 


The focus of this paper is to examine the interaction among signal, observation and 
externalities and to identify conditions that need to be imposed on each factor to ensure 
the highest possible level of learning. To address this issue and present the major findings, 
it is useful to first introduce a notation that categorizes the signal structure. The private
belief of an agent is defined by the probability of the true state being 1 according to her
signal only, and it is given by $\frac{f_1^Q(s)}{f_0^Q(s) + f_1^Q(s)}$.

Definition 1. We say that agents have unbounded private beliefs if
$\lim_{s \to 1} \frac{f_1^Q(s)}{f_0^Q(s) + f_1^Q(s)} = 1$ and
$\lim_{s \to -1} \frac{f_1^Q(s)}{f_0^Q(s) + f_1^Q(s)} = 0$
for some $Q$ on the support of distribution $G$. We say that
agents have bounded private beliefs if
$\lim_{s \to 1} \frac{f_1^Q(s)}{f_0^Q(s) + f_1^Q(s)} < 1$ and
$\lim_{s \to -1} \frac{f_1^Q(s)}{f_0^Q(s) + f_1^Q(s)} > 0$
for every $Q$ on the support of distribution $G$.


Unbounded private beliefs correspond to a situation in which a community may receive 
an arbitrarily strong signal about the underlying state, while bounded beliefs indicate 
that the amount of information that can be derived from a single private signal is always 
limited. 
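
As a concrete illustration of Definition 1, the sketch below uses a hypothetical linear signal structure that is not taken from this paper: the densities are assumed to be (1 + a s)/2 in state 1 and (1 - a s)/2 in state 0 on S = (-1, 1), with a slope parameter 0 < a <= 1. Under the equal prior, the private belief is then (1 + a s)/2, so beliefs are bounded when a < 1 and unbounded when a = 1. The function names and the parameter a are assumptions introduced only for this example.

```python
# Illustrative sketch only: the linear densities below are an assumed example,
# not the signal structure used in the paper.

def f1(s, a):
    """Density of the signal on (-1, 1) when the state is 1."""
    return (1.0 + a * s) / 2.0

def f0(s, a):
    """Density of the signal on (-1, 1) when the state is 0."""
    return (1.0 - a * s) / 2.0

def private_belief(s, a):
    """Posterior probability of state 1 given only the signal s (equal priors)."""
    return f1(s, a) / (f0(s, a) + f1(s, a))

def likelihood_ratio(s, a):
    """f1/f0, which the MLRP requires to be strictly increasing in s."""
    return f1(s, a) / f0(s, a)

def belief_range(a, eps=1e-9):
    """Approximate the limits of the private belief as s -> -1 and s -> 1."""
    return private_belief(-1.0 + eps, a), private_belief(1.0 - eps, a)

if __name__ == "__main__":
    grid = [i / 100.0 for i in range(-99, 100)]
    for a in (0.5, 1.0):
        ratios = [likelihood_ratio(s, a) for s in grid]
        mlrp_ok = all(x < y for x, y in zip(ratios, ratios[1:]))
        low, high = belief_range(a)
        print(f"a={a}: MLRP on grid: {mlrp_ok}, belief range ~ ({low:.4f}, {high:.4f})")
```

With a = 0.5 the private belief stays inside roughly (0.25, 0.75), so no single signal can be arbitrarily convincing; with a = 1 the belief approaches 0 and 1 at the ends of the support.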


3.2 The Sequential Decision Process 

The agents in each $N^t$ simultaneously take a single action, either 0 or 1. Let $a_n \in \{0,1\}$
denote the action of agent $n$ in $N^t$.

Agent $n$ cares about the action of every agent in $N^t$. Given $\{a_i : i \in N^t\}$, the payoff
of agent $n$ is equal to $u(\theta, a_n, m) > 0$, where $m$ is the number of actions in $\{a_i : i \in N^t\}$
that are the same as $a_n$. I make the following assumptions about $u$:

1. Given every $\theta$ and $a_n$, $u$ is increasing in $m$. This assumption means that every agent
prefers that more of her peers take the same action as she does.

2. Given every $m$, $u(0,0,m) = u(1,1,m) > u(1,0,m) = u(0,1,m)$. This assumption
means that every agent prefers to take the “correct” action, i.e., the action
that matches the state.

3. When $m$ is sufficiently large, $u(a, b, m) > u(a, 1-b, 1)$ for all $a, b \in \{0,1\}$. This assumption
means that coordination motives (conforming to the majority) can dominate
information motives (matching the state) in a large community.
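
The three assumptions on u can be checked on a simple example. The sketch below assumes a hypothetical additive payoff, u(theta, a, m) = 1 + 1{a = theta} + beta * m with beta > 0, which is not the paper's specification; it only illustrates how a large enough m lets coordination dominate matching the state. Here m counts the agent herself, so m = 1 means that she acts alone. The value beta = 0.25 and all function names are assumptions made for this example.

```python
# Illustrative sketch only: an assumed additive payoff, not the paper's u.
import math

def u(theta, a, m, beta=0.25):
    """Payoff of an agent taking action a in state theta when m agents
    (herself included) take action a. Always strictly positive."""
    match_bonus = 1.0 if a == theta else 0.0
    return 1.0 + match_bonus + beta * m

def coordination_threshold(beta=0.25):
    """Smallest m with u(a, b, m) > u(a, 1 - b, 1) for all a, b in {0, 1}.

    The binding case is b != a: 1 + beta*m > 2 + beta, i.e. m > 1 + 1/beta."""
    return math.floor(1.0 + 1.0 / beta) + 1

def check_assumptions(max_m=50, beta=0.25):
    states = actions = (0, 1)
    # Assumption 1: u is increasing in m for every state and action.
    a1 = all(u(t, a, m + 1, beta) > u(t, a, m, beta)
             for t in states for a in actions for m in range(1, max_m))
    # Assumption 2: matching the state is strictly better, symmetrically.
    a2 = all(u(0, 0, m, beta) == u(1, 1, m, beta)
             > u(1, 0, m, beta) == u(0, 1, m, beta)
             for m in range(1, max_m + 1))
    # Assumption 3: at or above the threshold, conforming beats deviating alone.
    m_star = coordination_threshold(beta)
    a3 = all(u(a, b, m, beta) > u(a, 1 - b, 1, beta)
             for a in states for b in actions
             for m in range(m_star, max_m + 1))
    return a1, a2, a3

if __name__ == "__main__":
    print(coordination_threshold(), check_assumptions())
```

Under these assumed numbers, an agent who joins five peers on the wrong action (m = 6) earns more than one who takes the correct action alone, which is the sense in which conformity can outweigh information in a large community.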

The direct influence of every agent’s action on the payoffs of other agents within 
the same community differentiates this model from most theoretical literature on social 
learning. In addition to the widely studied informational externalities that arise from 
sequential observation, there now exists a new parallel economic force, coordination motives,
that generates an incentive for an agent to conform with her peers. This incentive
becomes stronger as the community size increases. The primary goal of this paper is 
to ascertain how this incentive affects individual behavior as well as the overall learning 
level and to determine whether it improves or impairs the likelihood of agents taking the 
correct action over time. 

After receiving signal $s^t(Q^t)$ and before engaging in the above action, the agents may
observe some of the actions taken by their predecessors. In this paper, I will discuss two 
possible structures of observation. 

3.2.1 Exogenous Observation

The agents in $N^t$ observe the ordered action sequence in a neighborhood
$B^t \subseteq \bigcup_{t' < t} N^{t'}$ (each agent in $N^t$ observes the same action sequence). The
neighborhood $B^t$ is generated according to a probability distribution $\mathbb{Q}^t$ over the set
of all subsets of $\bigcup_{t' < t} N^{t'}$. The draws from each $\mathbb{Q}^t$ are independent from one another for all $t$
and from the realization of community size and private signals. Let $\bar{B}^t$
be the union of all possible neighborhoods that can be observed in period $t$. The sequence
$\{\mathbb{Q}^t\}_{t \in \mathbb{N}^+}$ is called the observation structure and is common knowledge, while the
realizations of $s^t$ and $B^t$ are known by agents only in $N^t$.

Let $H^t = \{a_m \in \{0,1\} : m \in B^t, B^t \subseteq \bar{B}^t\}$ denote the set of actions that $n$ can
possibly know from observation, and let $h^t$ be a particular action sequence in $H^t$. Let
$I^t = \{s^t(Q^t), h^t\}$ be $n$’s information set. Note that the information set of every agent in
$N^t$ is the same. The set of all possible information sets of $n$ is denoted as $\mathcal{I}^t$.
A strategy for $n$ is a mapping $\phi_n^t : \mathcal{I}^t \to \{0,1\}$ that selects a decision for every
possible information set. A strategy profile is a sequence of strategies
$\phi = \{\phi^t\}_{t \in \mathbb{N}^+} = \{\{\phi_n^t : n = 1, \ldots, Q^t\}\}_{t \in \mathbb{N}^+}$.
I use $\phi_{-n}^t$ to denote the strategies of all agents other
than $n$ in period $t$, $\phi_{-t} = \{\phi^{t'}\}_{t' \neq t}$ to denote the strategies of all agents other than those
in $N^t$, and $\phi_{-nt} = \{\phi_{-n}^t, \phi_{-t}\}$ to denote the strategies of all agents other than $n$.

Given a strategy profile, the sequence of decisions $\{a_n\}_{n \in \mathbb{N}}$ is a stochastic process. I
denote the probability measure generated by this stochastic process as $\mathcal{P}_\phi$.

3.2.2 Endogenous Observation

The agents in $N^t$ simultaneously acquire information about the previous decisions of
other agents through observation. Each agent $n$ can pay a cost $c > 0$ to obtain a capacity
$K(t) \in \mathbb{N}^+$; otherwise, she pays nothing and chooses $\emptyset$.

With capacity $K(t)$, agent $n$ can select a neighborhood $B^t(n) \subseteq \bigcup_{t' < t} N^{t'}$ of maximum
size $K(t)$, i.e., $|B^t(n)| \leq K(t)$, and observe the action of each agent in $B^t(n)$. The
actions in $B^t(n)$ are observed simultaneously, and no agent can choose any additional
observation based on what she has already observed. Let $\mathcal{B}^t(n)$ denote the set of all
possible neighborhoods that $n$ can observe. After the agents make their decisions regarding
observation, the actions that they choose to observe are revealed and become public
information within $N^t$. That is, every agent in $N^t$ observes $B^t = \bigcup_{n=1}^{Q^t} B^t(n)$.
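
The pooling step can be sketched in a few lines. The example below is a hypothetical illustration: predecessors are labelled by integers, each agent either declines to observe (an empty set) or selects at most K(t) predecessors, and the community observes the union of the individual selections. The integer labels and the function name are assumptions made for this example.

```python
# Illustrative sketch only: integer labels for predecessors are an assumption.

def pooled_neighborhood(choices, capacity):
    """Return the union of the individual neighborhoods chosen in a community.

    choices  -- one set of predecessor labels per agent; an empty set means
                the agent did not pay the observation cost
    capacity -- the capacity K(t) obtained by paying the cost c
    """
    for chosen in choices:
        if len(chosen) > capacity:
            raise ValueError("an agent selected more than K(t) predecessors")
    pooled = set()
    for chosen in choices:
        pooled |= chosen
    return pooled

if __name__ == "__main__":
    # Three agents, capacity 2: one agent does not observe, two overlap.
    choices = [{4, 7}, set(), {7, 9}]
    pooled = pooled_neighborhood(choices, capacity=2)
    print(sorted(pooled))  # [4, 7, 9]
    # The pooled set never contains more than Q^t * K(t) predecessors.
    assert len(pooled) <= len(choices) * 2
```

Because the union is shared within the community, a single observing agent is enough to give every peer the same observed actions.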

An agent’s strategy in the above sequential game consists of two problems: (1) given
her private signal, whether to make costly observation and, if so, whom to observe;
(2) after observation (or not), which action to take between 0 and 1 given the realization
of observed actions. With some abuse of notation, let
$H^t = \{a_m \in \{0,1\} : m \in B, B \subseteq \bigcup_{t' < t} N^{t'}, |B| \leq Q^t K(t)\}$
denote the set of actions that $n$ can possibly know from
observation by herself and others, and let $h^t$ be a particular action sequence in $H^t$.
$I^t = \{s^t(Q^t), h^t\}$ and $\mathcal{I}^t$ are defined similarly to those above.

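A strategy for $n$ is the set of two mappings $\sigma_n^t = (\sigma_n^{t,1}, \sigma_n^{t,2})$, where $\sigma_n^{t,1} : S \to \mathcal{B}^t(n)$
selects $n$’s choice of observation for every possible private signal and $\sigma_n^{t,2} : \mathcal{I}^t \to \{0,1\}$
selects an action for every possible information set. A strategy profile is a sequence
of strategies $\sigma = \{\sigma^t\}_{t \in \mathbb{N}^+} = \{\{\sigma_n^t : n = 1, \ldots, Q^t\}\}_{t \in \mathbb{N}^+}$. I use the notation
$\sigma_{-n}^t$, $\sigma_{-t}$ and $\sigma_{-nt}$ in a manner similar to that used for exogenous
observation.

Given a strategy profile, the sequence of decisions $\{a_n\}_{n \in \mathbb{N}}$ is a stochastic process. I
denote the probability measure generated by this stochastic process as $\mathcal{P}_\sigma$.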

A decisive difference between exogenous and endogenous observation lies in how observation 
correlates with signal. Under exogenous observation, no correlation between 
signal and observation exists because they are simply two independent processes. Under 
endogenous observation, however, observation (whether to observe and, if so, whom 
to observe) may depend on the value of the private signal because it is now part of an 
agent’s optimal decision. Conceivably, for an agent who attempts to extract information 
about the true state from her observation, her inference on private signals and the observation 
of her predecessors, which then partially determines her posterior belief on the 
state, will be formed very differently under the two observation structures. As shown in 
subsequent sections of the paper, observation structure has a significant impact on the 
pattern of social learning. 

3.3 Perfect Bayesian Equilibrium 

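Definition 2. A strategy profile $\sigma^*$ (resp. $\phi^*$) is a pure strategy perfect Bayesian 
equilibrium (PBE) if, for each $t \in \mathbb{N}^+$ and $n \in \{1, \dots, Q^t\}$, given the strategies 
of all other agents, (1) the action rule of $n$ maximizes the expected payoff of $n$ given every 
$I^t \in \mathcal{I}^t$ and (2) under endogenous observation, the observation rule of $n$ maximizes 
the expected payoff of $n$, given every private signal and given her action rule. 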

Whether observation is exogenous or endogenous, the idea underlying PBE is similar: 
given all available information and the strategy of each predecessor and each peer, 
an agent determines her payoff-maximizing strategy. In a model without coordination 
motives, this strategy always coincides with the strategy that maximizes the probability 
of taking the correct action, but in this situation, it may not because the actions of one’s 
peers must also be considered. An equilibrium strategy under endogenous observation 
differs from one under exogenous observation in its additional component of observation 
choice after receiving the private signal. In such a case, an agent optimizes her observation 
according to her signal value and others’ strategies. 

Throughout the remainder of the paper, I simply refer to PBE as “equilibrium”. 

Proposition 1. In every equilibrium $\sigma^*$ (resp. $\phi^*$) and for every $t$, actions are always 
unanimous in $N^t$: for every $I^t$, $\sigma_m^{*t}(I^t) = \sigma_n^{*t}(I^t)$ for every $m, n \in N^t$. 

Proposition 1 indicates an agent’s incentive to conform to her peers in the same community. 
Note that the posterior belief on the true state is the same across the community, 
and suppose, to the contrary, that two sub-groups of agents choose different actions. If an agent choosing 
action 1 weakly prefers 1 to 0, then each agent choosing 0 must strictly prefer 1 to 0. 
This is a contradiction, and hence, the only equilibrium action profile is unanimous. This 
result initially seems to indicate that coordination motives always exacerbate herding 
and are harmful for learning because there is now an additional incentive to ignore one’s 
signal and submit to the majority. However, this result also implies that agents in a 
community may conform to an action profile that depends on their signal rather than on 
observation, such that their actions become more informative for successors. As will be 
shown later, such behavior indeed improves social learning to a great extent. 

Notably, indifference between the two actions can exist in a mixed strategy equilibrium. 
In fact, when the community size is large, there always exists a mixed strategy equilibrium 
in which an agent’s probability of mixing between 1 and 0 depends on the signal value. 
However, because the mixed strategy equilibrium does not provide additional insight into 
the relation between social learning and coordination motives, I will not discuss it in 
detail in this paper. 

3.4 Learning 

The main focus of this paper is to determine what type of information aggregation will 
result from equilibrium behavior. First, I define the different types of learning studied in 
this paper. 

Definition 3. An equilibrium $\sigma^*$ (resp. $\phi^*$) has asymptotic learning if every agent 
takes the correct action at the limit: 

$\lim_{t \to \infty} \mathcal{P}_{\sigma^*}(a_n^t = \theta) = 1$ for all $n$. 

In this paper, the unconditional probability of taking the correct action, $\mathcal{P}_{\sigma^*}(a_n^t = \theta)$, 
is also referred to as the learning probability. Asymptotic learning requires that this 
probability converge to 1, i.e., the posterior beliefs converge to a degenerate distribution 
on the true state. In terms of information aggregation, asymptotic learning can be 
interpreted as equivalent to making all private signals public and thus aggregating information 
efficiently. It marks the upper bound of social learning with any signal structure 
and observation structure. 

Asymptotic learning may not always be achieved, especially under an endogenous 
observation structure, because a rational agent may choose not to make costly observations 
when her signal is already quite precise. In such a case, it is still interesting to 
determine whether information can be efficiently aggregated via observation, i.e., to ask 
the following question: when an agent decides to observe, will her observation reveal the 
truth and lead her to act correctly? A formal analysis calls for the notion of truth-telling 
observation, which is defined below. 

Let $\hat{a}^t$ be a hypothetical action that is equal to the state with higher posterior probability 
given any $I^t$. 

Definition 4. An equilibrium $\sigma^*$ (resp. $\phi^*$) has truth-telling observation if $\hat{a}^t = \theta$ 
whenever observation is non-empty at the limit: 

$\lim_{t \to \infty} \mathcal{P}_{\sigma^*}(\hat{a}^t = \theta \mid B^t \neq \emptyset) = 1$. 

Truth-telling observation is a weaker condition than asymptotic learning in two aspects. 
First, it requires only the state-matching action $\hat{a}^t$ to be perfectly correct conditional 
on non-empty observation as $t \to \infty$, as opposed to the unconditional correct 
action in asymptotic learning. Second, even in an equilibrium with truth-telling observation, 
an agent’s action conditional on non-empty observation may not coincide with 
$\hat{a}^t$. This stems from coordination motives: when the community size is large, the agents 
may conform to an action that matches the state with a probability lower than $\frac{1}{2}$. In 
contrast, asymptotic learning requires each agent’s equilibrium action to be always the 
same as $\hat{a}^t$ at the limit. Therefore, truth-telling observation should be regarded as a notion 
describing only the maximum informativeness of observation but not the correctness 
of equilibrium behavior, while asymptotic learning represents the highest level of both. 

4 Results for Exogenous Observation 

In this section, I present the main results for exogenous observation. A well-established 
theoretical prediction in much of the literature, which typically assumes that only one 
agent moves in each period, is that herding occurs when private beliefs are bounded: 
with a positive probability, all agents ultimately choose the wrong action after a particular 
time threshold. This occurs because learning cannot be improved indefinitely: 
at a certain point, some agent’s observation becomes so informative that she abandons 
her private signal altogether and herds with her predecessors, and by this time, social 
learning essentially ceases. In this section, I will demonstrate how coordination motives 
can prevent herding, can incentivize agents to use their private information, and can lead 
to a learning level that can be arbitrarily close to asymptotic learning. 

4.1 Two Conditions on Observation Structure 


When observation is exogenous, the observation structure, which indicates which predecessors 
each agent observes, plays an important role in determining whether asymptotic learning 
is possible. This structure is sometimes referred to as a network in the literature to highlight 
the connection between theory and application. To illustrate how the observation 
structure influences learning, I provide several typical observation structures below. 

1. A “star network”, in which each agent in every period $t$ observes only the actions 
in the first community in the action sequence. 

2. A “line network”, in which each agent observes only the closest 
community. 

3. A “complete network”, in which each agent observes every predecessor. 
This is the upper bound of observational information that can be obtained. 

In the first structure, asymptotic learning is never possible regardless of private beliefs 
and community size because learning cannot be improved beyond the second period: all 
agents after period 1 are essentially identical in terms of information acquired because 
they have identical observation. In the second and third observation structures, if each 
community is a singleton, then asymptotic learning occurs when private beliefs are unbounded 
but never occurs when private beliefs are bounded. Herding occurs in the latter 
case with a probability bounded away from zero. 

I now introduce two conditions on the observation structure that lead to approximate 
asymptotic learning in the presence of coordination motives, regardless of whether private 
beliefs are unbounded. The first condition stands in contrast to the “star network” above. 

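Definition 5. An observation structure has expanding observations if, for every 
$K \in \mathbb{N}^+$, the probability that the most recent community observed in period $t$ is 
earlier than community $K$ converges to 0 as $t \to \infty$. 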

The concept of expanding observations for singleton communities was first introduced 
by Acemoglu et al. [1], and in my paper, I generalize this concept to the case with non-singleton 
communities. This property implies that an agent always observes a predecessor 
who is not too far away. The “star network” clearly does not have expanding observations, 
but the “line network” and “complete network” do. This property on observation 
structure ensures that if any improvement on learning is ever possible, it is transmitted 
to every agent over time via observation in the sense that no agent will be blocked from 
any recent development in learning by only observing some distant predecessors. 
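To make the contrast between the three example structures concrete, the following sketch encodes each of them as a deterministic rule and checks whether the most recently observed community keeps pace with time. The function names, the finite horizon, and the deterministic simplification (the definition itself is stated in terms of probabilities) are assumptions made for illustration only.

```python
from typing import Callable, Set

Structure = Callable[[int], Set[int]]


def star(t: int) -> Set[int]:
    """Every period observes only the first community."""
    return {1}


def line(t: int) -> Set[int]:
    """Every period observes only the immediately preceding community."""
    return {t - 1}


def complete(t: int) -> Set[int]:
    """Every period observes all earlier communities."""
    return set(range(1, t))


def most_recent_observed(structure: Structure, t: int) -> int:
    """Index of the latest community observed in period t (t >= 2)."""
    return max(structure(t))


def stays_recent(structure: Structure, bound: int, horizon: int = 200) -> bool:
    """Finite stand-in for the limit: in the last periods up to `horizon`,
    the latest observed community is never earlier than `bound`."""
    if bound >= horizon:
        raise ValueError("bound must be smaller than horizon")
    tail = range(max(bound + 1, horizon - 20), horizon + 1)
    return all(most_recent_observed(structure, t) >= bound for t in tail)
```

In this simplified check, the line and complete rules pass for any bound below the horizon, whereas the star rule fails for every bound above 1, in line with the classification given in the text.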

However, expanding observations alone is not sufficient for asymptotic learning when 
private beliefs are bounded. For instance, the “line network” has expanding observations, 
but with bounded private beliefs, herding occurs with positive probability regardless of 
how large the community becomes. Therefore, I introduce the second condition. 

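Definition 6. An observation structure has infinite complete observations if there 
exists an infinite subset of time periods $\{t_1, t_2, \dots\}$ such that (1) for every $K \in \mathbb{N}^+$, 
the probability that fewer than $K$ actions are observed in period $t_n$ converges to 0 as 
$n \to \infty$, and (2) the probability that the neighborhood observed in period $t_n$ contains 
everything observed by each community in that neighborhood converges to 1 as $n \to \infty$. 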

The meaning of infinite observations is straightforward. Complete observations indicate 
the existence of a subset of agents such that an agent in this subset who observes a 
predecessor also observes all actions that can possibly be observed by the predecessor. At 
the limit, an agent’s observed neighborhood contains infinitely many actions. To understand 
why this condition is needed for approximate asymptotic learning with bounded 
private beliefs, first consider an observation structure that has only finite observations, 
such as a “line network”. On the one hand, bounded private beliefs prevent private information 
within any finite neighborhood from being arbitrarily informative about the true 
state. On the other hand, finite observations and bounded private beliefs together imply 
that once any observation by any predecessor in this neighborhood becomes sufficiently 
informative, it cannot be improved upon by substituting the predecessor’s action. To 
approach asymptotic learning, the only possibility is to have infinitely many actions in 
an observed neighborhood. 

Next, consider the case in which an agent has an “incomplete” observation of a predecessor. 
Hence, there are some actions that cannot be observed by the agent but that may 
have been observed by the predecessor. Now the predecessor’s action has an ambiguous 
effect on the agent’s posterior belief because the agent needs to consider those unobserved 
actions and make a corresponding inference based on her observation. As a result, the 
direction of her posterior belief (whether it favors state 0 or 1 after observing the predecessor’s 
action) cannot be determined solely by the predecessor’s action but depends on her 
observation. In other words, observing action 1 by the predecessor may cause the agent 
to favor either state in different situations, which makes her updated belief intractable. 
In contrast, complete observations determine the direction of the agent’s posterior belief 
without ambiguity. I provide further detail below when presenting the formal result. 

A simple example of an observation structure with both expanding observations and 
infinite complete observations is the “complete network” in which every agent observes 
the entire action history. In general, the two conditions are satisfied by a wide class of 
observation structures. 

4.2 Main Result 

I now present the main theoretical result of this section. 

Theorem 1. Assume that the observation structure has expanding observations and infinite 
complete observations. There exists $\bar{Q}$ such that if $G(Q > \bar{Q}) > 0$, then for every 
$\epsilon > 0$, there exists an equilibrium $\phi^*$ such that (1) truth-telling observation occurs and 
(2) $\lim_{t \to \infty} \mathcal{P}_{\phi^*}(a_n^t = \theta) > 1 - \epsilon$. 

This result shows that coordination motives can serve as an economic force that 
counters the herding incentive in a way that hurts individual agents ceteris paribus but 
benefits social learning. When the community size is large, the signal can be regarded 
as a correlating device to coordinate agents in the same community to conform to an 
action based on the signal value alone. This action may sometimes differ from the more 
“informed” action based on both signal and observation, but it does constitute mutual 
best responses and makes the actions of this community informative for successors. This 
is the key difference between a model with coordination motives and a model without 
them: in the latter, because every agent always seeks to maximize her probability of 
matching the state, herding can never be prevented when private beliefs are bounded. 

Following the rough intuition above, I present a heuristic proof of Theorem 1 (the 
complete proof with technical details can be found in the Appendix). First, properties of 
Bayes’ update determine that regardless of which is the true state, an agent’s posterior 
probability on the wrong state can never become arbitrarily close to 1 over time because 
otherwise the same set of observation inducing this posterior probability would need to 
occur with > 1 probability when the true state is altered, which is a contradiction. 

Next, I construct an equilibrium in which each observed action is informative. Consider 
an action profile that follows observation (that is, chooses the action matching the 
state with higher probability given observation only) when the signal is weak and that 
follows the signal when the signal is strong. This situation constitutes mutual best responses 
when $Q^t$ is large because the incentive to conform becomes stronger than the incentive 
to match the state. In this equilibrium, strong private signals are never abandoned. As 
a result, for an agent who has complete observation of another agent following such an 
action profile, Bayes’ update from observing this additional action will induce a posterior 
belief in favor of the corresponding state, in contrast to the belief that occurs without 
adding this observation. This claim implies a more important property of equilibrium 
behavior: following any belief about the state, additional observation of sufficiently many 
actions of the same value can induce a new belief that entails a higher ($> \frac{1}{2}$) probability 
of the corresponding state. 
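The claim that conformity becomes a best response once the community is large can be illustrated numerically. The sketch below assumes a hypothetical additive payoff (1 for matching the state plus a fixed weight for each peer who takes the same action); this is a stand-in chosen for simplicity and is not the payoff function of the model.

```python
def conform_is_best_response(posterior_other: float, peers: int, weight: float) -> bool:
    """Check whether conforming to the peers' common action is a best response.

    Hypothetical payoff: 1 if the action matches the state, plus `weight` for
    each peer choosing the same action. `posterior_other` is the agent's belief
    that the state matches the action the peers are NOT taking.
    """
    payoff_conform = (1.0 - posterior_other) + weight * peers
    payoff_deviate = posterior_other  # no peer shares the deviating action
    return payoff_conform >= payoff_deviate


def minimum_peers(posterior_other: float, weight: float) -> int:
    """Smallest number of conforming peers that makes conformity optimal."""
    if weight <= 0.0:
        raise ValueError("weight must be positive")
    peers = 0
    while not conform_is_best_response(posterior_other, peers, weight):
        peers += 1
    return peers
```

Under this assumed payoff, even an agent who is fairly confident that the common action is wrong will conform once enough peers do so, which is the sense in which the incentive to conform can dominate the incentive to match the state.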

Now we can demonstrate the truth-telling nature of observation. Consider a subset of 
agents with infinite complete observations, and note that the hypothetical action $\hat{a}^t$ can 
be regarded as the optimal action for some outside singleton agent who observes $B^t$ and 
attempts to maximize her probability of matching the state. Suppose that truth-telling 
observation does not occur, which implies that her highest learning probability is equal to 
some $p < 1$. Fix a sufficiently large $t'$ such that observing $B^{t'}$ gives her a probability 
of approximately $p$ of matching the state, and consider another sufficiently large number $A$ 
and the following sub-optimal strategy: given the action sequence in $B^{t'}$, she will change 
her action if and only if she observes $A$ consecutive additional actions of the same value, 
opposite to the action that she would have taken by observing only $B^{t'}$. It can be shown 
that this sub-optimal strategy already improves her learning probability by a significant 
amount, which makes the total probability exceed $p$, thus revealing a contradiction. Notably, 
the result is not obtained by the law of large numbers, as observed actions are 
not mutually independent: later actions are affected by earlier actions via observational 
learning. Instead, this strict improvement stems from calculating the difference between 
the probabilities of the $A$ actions being “helpful” (in the sense that they correct a wrong 
belief) and “harmful” (in the sense that they mislead from a correct belief); details are 
provided in the Appendix. 

Finally, I identify an inverse relation between the limit learning probability and 
the probability of agents acting according to signal only. Truth-telling observation implies 
that at the limit, the probability of taking the correct action conditional on non-empty 
observation is equal to 1; hence, the total learning probability at the limit is the sum of 
the probability that agents consider their observation and the probability that a strong 
signal occurs favoring the true state. The cutoff for a strong signal is arbitrary: as long as 
each agent uses her signal with a fixed positive probability, truth-telling observation occurs. 
Hence, the higher this cutoff is, the more likely an agent chooses her action according to 
observation, and thus, the higher the learning probability is. In this way, any learning 
probability that is less than 1 can be obtained in equilibrium. 
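The trade-off in this last step can be seen in a small numerical sketch. It assumes, purely for illustration, signals on [-1, 1] with densities f1(s) = (1 + b s)/2 and f0(s) = (1 - b s)/2, where 0 < b < 1 gives bounded private beliefs, and an equilibrium in which agents act on a truth-telling observation when |s| is below the cutoff and follow the signal otherwise. The function names and the density are hypothetical choices, not objects defined in the paper.

```python
def error_probability(cutoff: float, b: float) -> float:
    """Probability, under state 1, of a strong signal pointing to state 0.

    Assumed density f1(s) = (1 + b*s)/2 on [-1, 1]; the value returned is the
    integral of f1 from -1 to -cutoff.
    """
    if not (0.0 <= cutoff <= 1.0 and 0.0 < b < 1.0):
        raise ValueError("expected 0 <= cutoff <= 1 and 0 < b < 1")
    x = -cutoff
    return (x + 1.0) / 2.0 + b * (x * x - 1.0) / 4.0


def limit_learning_probability(cutoff: float, b: float) -> float:
    """Weak signals lead to the correct action; strong signals are followed."""
    return 1.0 - error_probability(cutoff, b)


# Raising the cutoff shrinks the set of signals that are followed blindly.
probabilities = [limit_learning_probability(k, b=0.5) for k in (0.0, 0.5, 0.9, 0.99)]
```

In this example the learning probability rises with the cutoff and approaches 1 as the cutoff approaches the edge of the signal support, while any cutoff strictly inside the support keeps a positive probability of signal-based actions.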

Note that the condition of infinite complete observations is not required for asymptotic 
learning when private beliefs are unbounded. This has been proved for singleton 
communities in the literature (see, e.g., Smith and Sorensen [33]) and is extended to this 
model with coordination motives. I note this result below and use it in subsequent proofs. 

Proposition 2. Assume that the observation structure has expanding observations and 
that private beliefs are unbounded. There always exists an equilibrium with asymptotic 
learning. 

5 Results on Endogenous Observation 

In this section, I analyze the model under endogenous observation by discussing the cases 
of unbounded and bounded private beliefs separately. Note that costly and strategic 
observation creates an independent economic force by itself; it discourages an agent from 
observation when her signal is informative because the additional benefit from observation 
becomes small or even negligible. With this added strategic component, the effect of 
coordination motives becomes more subtle, but in general, a similar implication can be 
derived: with sufficiently strong coordination motives, the level of social learning can be 
improved. 

5.1 Unbounded Private Beliefs 

5.1.1 Singleton Communities 

To fully understand how coordination motives change the pattern of learning, it is important 
to first understand how singleton agents (that is, $G(1) = 1$) behave when observation 
is endogenous; this behavior has scarcely been explored in the previous literature. First, 
I show that asymptotic learning never occurs in this environment. 


Theorem 2.A (Song (2015)). Asymptotic learning does not occur in any equilibrium. 


Although the result stands in stark contrast to the well-known result in the literature 
with exogenous observation, the underlying argument for this result is rather straightforward. 
By assumption, no private signal perfectly reveals the true state; for asymptotic 
learning to occur in any equilibrium, a necessary condition is that an agent chooses 
to observe for almost every private signal. If an agent chooses to observe, her payoff 
is upper bounded by $u(1,1,1) - c$ because the best possible observation is one that 
fully reveals the true state and guarantees her a benefit of $u(1,1,1)$. If she chooses 
not to observe and simply follows her private signal $s$ (say, one favoring state 1), her 
expected payoff is equal to 
$\frac{f_1(s)}{f_0(s) + f_1(s)} u(1,1,1) + \left(1 - \frac{f_1(s)}{f_0(s) + f_1(s)}\right) u(1,0,1)$. 
Because private beliefs are unbounded, for any positive $c$, there is always a positive 
measure of signals such that the payoff from no observation is higher, which implies that 
asymptotic learning never occurs. 

Although asymptotic learning is impossible, efficient information aggregation can still 
be achieved in the form of truth-telling observation. Assuming a symmetric signal structure, 
the following result provides a necessary and sufficient condition for truth-telling 
observation in this environment as well as a full characterization of the limit learning 
probability. 


Theorem 2.B (Song (2015)). Assume that the signal structure is symmetric: $f_0(s) = 
f_1(-s)$ for every $s \in S$. Truth-telling observation occurs in $\sigma^*$ if and only if 
$\lim_{t \to \infty} K(t) = \infty$. In addition, $\lim_{t \to \infty} \mathcal{P}_{\sigma^*}(a_n^t = \theta) = F_0(s^*)$, 
where $s^*$ is characterized by 

$$\frac{f_1(s^*)}{f_0(s^*) + f_1(s^*)} u(1,1,1) + \frac{f_0(s^*)}{f_0(s^*) + f_1(s^*)} u(1,0,1) = u(1,1,1) - c.$$
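As a worked illustration of how the threshold signal is pinned down, the sketch below solves the indifference condition by bisection. All primitives are assumptions chosen for tractability rather than taken from the paper: signals on [-1, 1] with f1(s) = (1 + s)/2 and f0(s) = (1 - s)/2 (symmetric, with unbounded private beliefs), and payoffs normalized to u(1,1,1) = 1 and u(1,0,1) = 0.

```python
def f1(s: float) -> float:
    """Assumed signal density under state 1 on [-1, 1]."""
    return (1.0 + s) / 2.0


def f0(s: float) -> float:
    """Assumed signal density under state 0; symmetric: f0(s) = f1(-s)."""
    return (1.0 - s) / 2.0


def cdf0(s: float) -> float:
    """Closed-form CDF of the assumed f0 on [-1, 1]."""
    return s / 2.0 - s * s / 4.0 + 0.75


def solve_threshold(cost: float, u_match: float = 1.0, u_miss: float = 0.0,
                    tol: float = 1e-12) -> float:
    """Find the signal in [0, 1] at which following it pays exactly u_match - cost."""

    def gap(s: float) -> float:
        # Posterior probability that the action favored by s is correct.
        p = f1(s) / (f0(s) + f1(s))
        return p * u_match + (1.0 - p) * u_miss - (u_match - cost)

    lo, hi = 0.0, 1.0
    if gap(lo) >= 0.0:
        return 0.0  # cost so high that no signal justifies observing
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


s_star = solve_threshold(cost=0.1)
limit_learning_probability = cdf0(s_star)
```

Under these assumptions the condition reduces to (1 + s*)/2 = 1 - c, so s* = 1 - 2c and the limit learning probability equals 1 - c^2; the numerical routine is only needed for densities without such a closed form.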


The property of truth-telling observation also holds in the case of exogenous observation 
with unbounded private beliefs and expanding observations, but the underlying 
mechanism here is much different. Under exogenous observation, an agent always uses her 
private information with a positive probability (which converges to 0 over time) because 
her signal can be strong enough to overwhelm the realized observation. Under endogenous 
observation, an agent may choose to use her private information and not observe at all 
because although observation can still be beneficial, its marginal benefit in information 
does not cover the cost. This probability of no observation does not converge to 0 over 
time. As a result, an agent’s individual action is always erroneous with a probability 
bounded away from 0, which then implies that observing any finite sequence of actions 
does not reveal the true state regardless of when the actions occurred. In other words, 
truth-telling observation never occurs when $\lim_{t \to \infty} K(t) \neq \infty$. However, this individual 
error is precisely the source of informativeness: because an agent sometimes chooses to 
forgo the (potentially more informative) observation, her action is indicative of the range 
of signals that she receives. Therefore, once an agent observes an arbitrarily large neighborhood, 
information can be aggregated efficiently to reveal the true state. Once again, 
this follows not from the law of large numbers but from an argument of continuing strict 
improvement similar to that in Theorem 1. 

In terms of the limit learning probability, it is straightforward that $F_0(s^*)$ is the 
largest possible learning probability in equilibrium, and it is achievable only when truth-telling 
observation occurs. After all, it is impossible in any equilibrium for any agent to 
choose to observe when her signal is not in $[-s^*, s^*]$. Hence, we can conclude that with 
unbounded private beliefs, endogenous observation lowers the limit learning probability 
compared with the most informative scenario under exogenous observation. However, 
endogenous observation may lead to a higher limit learning probability than exogenous 
observation without expanding observations because although agents will not observe 
given extreme signals, they make more informed choices when they do observe. For 
instance, consider the “star network” in the previous example of observation structures. It 
can be shown that if observation is endogenous and $K(t) = 1$, each agent will observe her 
immediate predecessor whenever she chooses to observe, and the limit learning probability 
is higher than that in the “star network” when $c$ is low. 

5.1.2 Non-Singleton Communities 

In this section, I present the main result for non-singleton communities and compare it 
with the result for singleton communities above. 

Theorem 3. There is a cutoff $\bar{c} > 0$ such that for every $c \in (0, \bar{c})$, there exists $\bar{Q}(c)$ such 
that if $G(Q > \bar{Q}(c)) = 1$, there exists an equilibrium $\sigma^*$ with asymptotic learning. 

Before elaborating on this result, I must first describe such an equilibrium that leads 
to asymptotic learning. For agents in the same community $N^t$, consider two action profiles: 
a “truth-seeking” profile in which agents conform to the action that matches the state 
with higher probability according to all available information and a sub-optimal profile in 
which they act otherwise (for example, they conform to the action that is the worst match 
for the state). The first profile clearly yields a higher payoff for every agent in expectation. 
Now consider the following strategy profile for observation and action: agent 1 observes 
a prescribed neighborhood given any signal, and no other agent observes. If the realized 
observed neighborhood is revealed to be the same as the prescribed one, then the agents 
follow the “truth-seeking” action profile; otherwise, they follow the sub-optimal profile. 

When the community size is large, both action profiles constitute best responses, 
which then implies (by backward induction) that making the prescribed observation is 
indeed optimal for agent 1. Hence, we have an equilibrium in which the observation of 
an arbitrary non-empty neighborhood occurs regardless of signal value. By imposing the 
property of expanding observations on this sequence of the observed neighborhood (for 
example, agent 1 in each period observes agent 1 in the previous period), we can apply 
Proposition 2 to obtain asymptotic learning. 

This result identifies an effect on strategic observation that is imposed by coordination 
motives: more observation can be encouraged as the community size grows. In the 
equilibrium described above, by conforming to different actions according to the observed 
neighborhood, the agents essentially make it more costly for agent 1 not to observe, and 
thus, the range of signals for which agent 1 will observe the prescribed neighborhood 
expands. When the community size becomes sufficiently large, this signal range becomes 
the whole support S, and hence, an unbroken chain of observation is established even 
when observation is costly. As a result, the efficient aggregation of information is restored. 

From the construction of equilibrium, we can also see that the result is robust to 
the specific cost structure of observation. In a more general model, let $c^t(k)$ denote a 
cost function for observing $k$ predecessors in period $t$. As long as $c^t(1)$ has a constant 
upper bound, Theorem 3 can be applied to show that asymptotic learning can occur in 
equilibrium. 

5.2 Bounded Private Beliefs 

When private beliefs are bounded and only a finite-size neighborhood can be observed 
at the limit, the level of social learning is always bounded away from 1 because of either 
herding or a persistent probability of error.¹ Therefore, in this section, I assume 
that $\lim_{t \to \infty} K(t) = \infty$ to show a sharp contrast between learning with and without 
coordination motives. 


As in the previous section, I first discuss the effect of endogenous observation on 
learning in an environment in which agents are singletons. The limit learning probability 
can be affected in either direction: whether it rises or falls compared with exogenous 
observation depends greatly on the value of c, the cost of observation. The following 
example illustrates this result and its underlying mechanism without loss of generality. 

Assume that only one agent moves in each period. Consider the following two cases: 
exogenous observation where $B^t = \{1, \dots, t-1\}$ and endogenous observation where 
$K(t) = t - 1$. It is established in the literature (see, e.g., Smith and Sorensen [33]) that 
when observation is exogenous, the limit learning probability has an upper bound $P < 1$. 
In other words, at the limit, an agent behaves better than merely following her own signal, 
but she cannot learn the true state perfectly. 

Under endogenous observation, Theorem 2 can be extended here to characterize the 
limit learning probability for a range of the cost $c$. Consider a symmetric signal structure. 
Note that unbounded private beliefs constitute a sufficient but not necessary condition 
for the proof of Theorem 1. In fact, truth-telling observation requires only that beliefs be 
“strong” relative to cost, i.e., 

$$\lim_{s \to 1} \frac{f_1(s)}{f_0(s) + f_1(s)} u(1,1,1) + \lim_{s \to 1} \frac{f_0(s)}{f_0(s) + f_1(s)} u(1,0,1) > u(1,1,1) - c.$$

In other words, as long as an agent prefers not to observe (even if observation 
reveals the truth) when her signal takes the most extreme value, the necessary 
and sufficient relation between truth-telling observation and infinite observation 
at the limit can be derived following the same argument as used previously. Hence, 
when $c > \lim_{s \to 1} \frac{f_0(s)}{f_0(s) + f_1(s)} (u(1,1,1) - u(1,0,1))$, letting $s(c)$ be characterized by 

$$\frac{f_1(s(c))}{f_0(s(c)) + f_1(s(c))} u(1,1,1) + \frac{f_0(s(c))}{f_0(s(c)) + f_1(s(c))} u(1,0,1) = u(1,1,1) - c,$$

we have an expression for the limit learning probability that is denoted $P(c)$: 

$$P(c) = F_0(s(c)).$$


Depending on $c$, the value of $s(c)$ ranges from 0 to arbitrarily close to 1. As a 
result, the value of $P(c)$ ranges from $F_0(0)$ to arbitrarily close to 1. We see here that 
endogenous observation affects social learning in a way that is monotonic in $c$: compared 
with exogenous observation, endogenous observation is better for social learning when $c$ 
is relatively large and is worse for social learning when $c$ is relatively small. 

¹ This claim is valid for both exogenous and endogenous observation. Formal results can be found in 
Song [35]. 

I now present the main result for coordination motives. Regardless of the value of $c$, 
coordination motives facilitate learning in the sense that they increase the highest possible 
equilibrium learning probability. 

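Theorem 4. There is a cutoff $\bar{c} > 0$ such that for every $c \in (0, \bar{c})$, there exists $\bar{Q}(c)$ 
such that if $G(Q > \bar{Q}(c)) = 1$, for every $\epsilon > 0$, there exists an equilibrium $\sigma^*$ where (1) 
truth-telling observation occurs and (2) $\lim_{t \to \infty} \mathcal{P}_{\sigma^*}(a_n^t = \theta) > 1 - \epsilon$. 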

This result can be derived from a combination of Theorems 1 and 3. First, based 
on Theorem 3, agents can be incentivized to observe a prescribed neighborhood given 
any signal; then, according to Theorem 1, when the prescribed neighborhood is observed, 
the signal serves as a correlation device for the agents to coordinate on an action profile 
that accounts for all available information with a certain probability. This probability 
can be arbitrarily close to 1. Consequently, for any fixed observation cost $c$, when the 
community size is large, there is always an equilibrium with a higher learning probability 
than is available for singleton communities. 

5.3 Summary 

Before discussing some extensions of the model, I briefly summarize the comparison across 
observation structures and community sizes in this section. To introduce a different 
and useful perspective for examining the impact of various factors on social learning, I 
categorize the main results here by signal structure and regard the case with exogenous 
observation and singleton communities as a benchmark. 

When private beliefs are unbounded, in the benchmark case, the level of social learning 
depends entirely on the pattern of observation. Asymptotic learning occurs if and only 
if at the limit an agent almost surely observes a close predecessor (e.g., the “complete” 
network). The presence of coordination motives does not change this property of learning. 
When observation becomes endogenous, asymptotic learning cannot be achieved because 
the positive observation cost prevents an agent from observing when her signal is strong. 
Imposing coordination motives now makes a difference in the sense that it encourages 
observation and thus restores asymptotic learning when the community size is sufficiently 
large. Figure 1 uses some representative observation structures to illustrate the learning
pattern over time in different environments. 

[Figure 1 omitted. Vertical axis: limit learning probability.]

Figure 1: Learning Patterns with Unbounded Private Beliefs
(En = endogenous observation; Ex = exogenous observation)


When private beliefs are bounded, the benchmark case typically produces a learning 
probability bounded away from 1, regardless of whether agents observe close or distant 
predecessors. Making observation endogenous can cause this probability to be either 
higher or lower depending on the observation cost c. With coordination motives, the 
highest possible learning probability increases for any value of c when the community 
size is sufficiently large; in particular, it can be arbitrarily close to 1 in equilibrium. 
Figure 2 illustrates these scenarios. 



Figure 2: Learning Patterns with Bounded Private Beliefs 
(En = endogenous observation; Ex = exogenous observation) 
6 Discussion


6.1 Equilibrium Selection and Risk Dominance 


The conforming incentive generated by coordination motives results in multiple equilibria 
in an environment with large communities. Many of my previous results are built on the 
fact that conforming based on the most informed action and on a less informed action are 
both optimal responses for agents in the same community, which does not occur when 
agents are singletons because one’s unique best response would then be to use all available 
information. A natural question is then whether different equilibria can be compared in
any way and, if so, whether an equilibrium selected by some criterion changes the implications
of having coordination motives in the model. In this section, I propose risk dominance
as an equilibrium selection method and discuss its properties and impact. In particular, 
this criterion is imposed on the interim stage in which signal and observation have been 
realized: it essentially enables comparison between two action profiles and selects a unique 
equilibrium action for each information set. 

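Consider any strategy profile with any community size $Q^t$ and any information set $I^t$. Let
$a^1(I^t) = \{a^1_n(I^t)\}_{n=1}^{Q^t}$ and $a^2(I^t) = \{a^2_n(I^t)\}_{n=1}^{Q^t}$ denote two arbitrary action profiles with
unanimous action. Let $v^t_n(a, I^t)$ denote agent n’s expected payoff given action profile $a$ and $I^t$.

Definition 7. I say that $a^1(I^t)$ risk-dominates $a^2(I^t)$ if for every $N'^t \subset N^t$ and every
$n \in N'^t$, we have

$$v^t_n(a^1(I^t), I^t) - v^t_n(\{a^2_m(I^t)\}_{m \in N'^t}, \{a^1_m(I^t)\}_{m \notin N'^t}, I^t)
> v^t_n(a^2(I^t), I^t) - v^t_n(\{a^1_m(I^t)\}_{m \in N'^t}, \{a^2_m(I^t)\}_{m \notin N'^t}, I^t).$$

If $a^1(I^t)$ risk-dominates every other action profile for every $I^t$, we say that $a^1(I^t)$ is risk
dominant.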


The idea behind risk dominance is the following: suppose that a subset of agents
$N'^t \subset N^t$ switches their action from a given profile to an alternative profile. If action
profile 1 risk-dominates action profile 2, then the expected loss for every agent in $N'^t$ in
switching from action profile 1 to 2 is always larger than the loss involved in switching
from 2 to 1, for every possible $N'^t$ and information set $I^t$. One interpretation of risk
dominance is that it indicates an agent’s preference for one action profile over the other
when she is uncertain about which will be used by others in her community.


An intuitive candidate for a risk-dominant action profile is one in which each agent
makes the best use of $I^t$, for which I provide a formal definition below. This is actually
the generically unique risk-dominant action profile.

Definition 8. An action profile is truth-seeking if for every t, n and $I^t$, n chooses the
action that maximizes the probability of $a^t_n = \theta$ given $I^t$.

Proposition 3. The truth-seeking action profile is risk dominant. 

It is easy to see that the truth-seeking action profile yields the highest possible expected
payoff for every agent in $N^t$ given $I^t$. In fact, if we fix the number of agents that
take a given action a, each agent’s payoff is the highest when a is truth-seeking. Hence,
given a subset $N'^t$ of action-switching agents, their loss is always less when switching
from some other action profile to the truth-seeking profile rather than the opposite. I
call the equilibrium with the truth-seeking profile a truth-seeking equilibrium and
denote it as $\phi^T$ and $\sigma^T$ under exogenous and endogenous observation, respectively. I then
inspect how learning in this particular equilibrium changes according to the community
size.

When observation is exogenous, coordination motives have no effect on the general
learning pattern in a truth-seeking equilibrium: regardless of G, asymptotic learning
occurs if private beliefs are unbounded and if the observation structure has expanding
observations and does not occur otherwise^. The truth-seeking action profile prevents
agents from conforming to a less informed action that uses more of their private information,
and hence, the whole community acts as a single agent that makes her best effort to
match her action with the true state. The conforming incentive does not alter anything
in the agents’ behavior regardless of how large a community becomes.

When observation is endogenous, however, coordination motives still play an important
role in learning in a truth-seeking equilibrium. For example, consider the following
scenario. Suppose that the signal distribution is the same for every Q, i.e., $f^Q_a = f_a$,
$a = 0, 1$, and that $\{f_0, f_1\}$ is symmetric with unbounded private beliefs. In addition, the agents
have infinite observations ($\lim_{t \to \infty} K(t) = \infty$).

^To be precise, it has been proven that asymptotic learning does not occur when the observation
structure does not have expanding observations or when private beliefs are bounded and the observation
structure takes several typical forms. For a more specific account, see, e.g., Acemoglu et al. [1] and
Song [35].
Proposition 4. For every $\sigma^T$, we have $\lim_{t \to \infty} \mathcal{P}_{\sigma^T}(a^t = \theta) = E_G[F_0(s^*(Q))]$, where
$s^*(Q)$ is characterized by the following equation:

$$\frac{f_0(s^*(Q))}{f_0(s^*(Q)) + f_1(s^*(Q))} u(1, 1, Q) + \frac{f_1(s^*(Q))}{f_0(s^*(Q)) + f_1(s^*(Q))} u(1, 0, Q) = u(1, 1, Q) - c.$$

Given the probability distribution of community size G, let $P_T(G)$ denote the probability
of the correct action occurring at the limit in any truth-seeking equilibrium. We
have the following corollary:

Corollary 1. For $G', G$ such that $G'$ first-order stochastically dominates $G$, $P_T(G') > P_T(G)$.


In a truth-seeking equilibrium, coordination motives still encourage agents to observe,
not because no observation or a “wrong” observation entails that they conform to a
sub-optimal action as in the previous results but because observation generates greater
expected benefit in a larger community. As a result, the range of signals leading to non-empty
observation increases while the truth-telling property of observation is preserved.
Therefore, a larger community size increases the limit learning probability, although the
improvement is smaller in $\sigma^T$ than in the equilibrium constructed in
Section 5 because asymptotic learning does not occur.
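
The mechanism described above can be illustrated with a minimal numerical sketch. It is not part of the formal analysis. The payoff function below (parameters `b` and `k`) is a hypothetical choice made only for illustration, and the sketch assumes that costly observation reveals the true state. Let q be the probability that the action suggested by the signal alone matches the state. The agent is indifferent between following the signal and observing when $q\,u(1,1,Q) + (1-q)\,u(1,0,Q) = u(1,1,Q) - c$, so she observes whenever q is below $1 - c/(u(1,1,Q) - u(1,0,Q))$.

```python
# Illustrative sketch only: the payoff function is a hypothetical example,
# not the one used in the paper.

def u_match(m: int, b: float = 0.5) -> float:
    """Payoff when the action matches the state and m agents take that action."""
    return 1.0 + b * (m - 1)


def u_mismatch(m: int, b: float = 0.5, k: float = 0.5) -> float:
    """Payoff when the action does not match the state and m agents take it."""
    return k * b * (m - 1)


def posterior_cutoff(q_size: int, c: float, b: float = 0.5, k: float = 0.5) -> float:
    """Largest signal precision at which paying the cost c is still worthwhile.

    Assumes (for illustration) that observation reveals the true state, so
    observing yields u_match(q_size) - c, while following the signal yields
    q * u_match + (1 - q) * u_mismatch. The cutoff is floored at 0.5 because
    the signal-following action is correct with probability at least 0.5.
    """
    gap = u_match(q_size, b) - u_mismatch(q_size, b, k)
    return max(0.5, 1.0 - c / gap)


if __name__ == "__main__":
    for q_size in (1, 2, 5, 10):
        print(q_size, round(posterior_cutoff(q_size, c=0.1), 4))
```

Under this hypothetical payoff, the gap between matching and mismatching grows with the community size, so the cutoff rises with Q and the set of signals that lead to observation widens, in line with the discussion above.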

When private beliefs are bounded, coordination motives can work in opposite directions.
As argued in Section 5, truth-telling observation occurs in $\sigma^T$ whenever private
beliefs are “strong” relative to cost, i.e., when the payoff of simply following an extreme
signal exceeds that of knowing the true state by costly observation. Similar to the above
proposition, the first payoff can be written as a weighted sum of $u(1, 1, Q)$ and $u(1, 0, Q)$,
with the posterior probabilities at the extreme signal as weights,
while the second payoff is $u(1, 1, Q) - c$, which implies that the marginal effect of increasing
Q is higher in the latter. We can then conclude that increasing the likelihood of a
larger community size is better for social learning when private beliefs remain “strong”
because, once again, it encourages observation that is still truth-telling. However, social
learning is harmed when private beliefs become weak because the informativeness of
observation may overwhelm that of any private signal and induce herding.


6.2 Herding with Imperfect Knowledge of Signals 


The previous analysis has shown that coordination motives can reduce herding and facilitate
efficient aggregation of private information. An important assumption leading to this
result is that all information, both private signals and observations, is shared within
a community and can be used as a correlating device. In this section, I show that when
private signals are “private” in the strictest sense and cannot be shared, coordination
motives have exactly the opposite effect: herding occurs even when private beliefs are
unbounded.

Assume that in every community $N^t$, each agent n receives a private signal $s^t_n$ that
cannot be observed by any other agent. Without loss of generality, assume that $s^t_n$ follows
an independent and identical distribution $\{F_0, F_1\}$ conditional on the state, for every t and n.
Consider an exogenous “line” network, i.e., $B^t = N^{t-1}$. In every symmetric equilibrium,
i.e., an equilibrium in which agents in the same community use identical strategies, I find the
following property:

Proposition 5. There exists $\bar{Q}$ such that if $G(Q > \bar{Q}) = 1$, asymptotic learning never
occurs in any equilibrium.

This result stands in stark contrast to the case of signal sharing, in which there exists an
equilibrium with asymptotic learning whenever private beliefs are unbounded and some
close predecessors are observed. The reason is that keeping signals private creates
uncertainty about other agents’ actions. To achieve asymptotic learning, an
agent’s action must at least sometimes contradict her observation (presumably when her
signal is rather strong). However, when a highly informative observation is realized, since
the observation is shared within the community and signals are independent, the agent
can infer that no matter what the true state is, the other agents in her community will
probably take the action that matches the observation. The existence of coordination
motives then implies that her optimal action is to follow the observation as well. As
a result, herding occurs with positive probability in every equilibrium even when private
beliefs are unbounded.

Herding in this environment highlights the importance of common beliefs in determining
the impact of coordination motives. When private signals are shared, agents have identical
beliefs, which makes it possible for them to coordinate on either action in equilibrium when
their community is large. As a result, they may agree to rely on private information only
even though their observation is rather informative already. However, once each private
signal is observed by the corresponding agent only, beliefs become heterogeneous across
agents. The impossibility of ascertaining one another’s private signal incentivizes every
agent to choose the “safer” option of following the commonly known observation, which
then leads to herding.

6.3 Separation Motives 

Thus far, I have focused on scenarios in which agents are willing to conform to a unanimous
action. However, in some cases, there may be a “congestion effect” on action,
i.e., more agents choosing the same action results in a smaller payoff for each agent. For
instance, too many customers squeezing into a restaurant will probably result in a negative
dining experience in terms of waiting time and noise level even if the restaurant is
superior to its competitors in food quality. Consequently, a customer may actually prefer
another restaurant with ordinary food but smaller crowds.

In this section, I show how the model developed above can be used to analyze this
opposite environment with separation motives and its impact on learning. Consider the
payoff function $u(\theta, a^t_n, m)$, and replace the second assumption on u with the assumption
that u is decreasing in m.

Assume that observation is exogenous. For community $N^t$, let P denote an arbitrary
posterior probability that the true state is 1 given their signal and observation. The
following result describes the pattern of equilibrium behavior:

Proposition 6. If $P > \frac{1}{2}$, in every equilibrium $\phi^*$, at least half of the agents in $N^t$ choose
action 1.

This result asserts that regardless of community size, at least half of the agents will
always take the more informed action. Here, separation motives work exactly the opposite
of coordination motives: instead of urging agents to conform to an action taken by the
majority, separation motives divide agents into two groups, always with more agents
in the group representing better information. This situation results from the trade-off
between a less crowded group and a more informative action. Moreover, it can be shown
that for some classes of utility function (e.g., u is linear in m), as the community size
rises, one can make increasingly precise inferences on the agents’ posterior belief from
observing all actions in the community. Then, if observation is exogenous and more or
less “complete”, i.e., at least at the limit, an agent observes almost the entire action
history, the learning pattern is similar to that with singleton communities. Asymptotic
learning occurs when private beliefs are unbounded but never occurs otherwise.
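
The split described in Proposition 6 can be checked by brute force in a toy example. The sketch below is illustrative only and rests on assumptions not in the model: the payoff is taken to be linear, equal to 1 minus a congestion term when the action matches the state and to minus the congestion term otherwise, with a hypothetical congestion parameter `d`. The function enumerates every split of a community of size `q` between the two actions and keeps those from which no single agent gains by switching.

```python
# Illustrative sketch only: the linear congestion payoff and the parameter d
# are hypothetical choices, not taken from the paper.

def separation_equilibria(q: int, p: float, d: float) -> list[int]:
    """Return every k such that 'k agents choose action 1' is a pure equilibrium.

    q: community size; p: posterior probability that the state is 1;
    d: payoff lost per additional agent choosing the same action.
    """

    def expected_payoff(prob_match: float, m: int) -> float:
        # Matching the state pays 1; each of the other m - 1 agents on the
        # same action costs d, whether or not the action matches.
        return prob_match - d * (m - 1)

    equilibria = []
    for k in range(q + 1):
        stable = True
        if k > 0:
            # An agent on action 1 must not gain by joining the action-0 group.
            stay = expected_payoff(p, k)
            switch = expected_payoff(1 - p, q - k + 1)
            stable = stable and stay >= switch
        if k < q:
            # An agent on action 0 must not gain by joining the action-1 group.
            stay = expected_payoff(1 - p, q - k)
            switch = expected_payoff(p, k + 1)
            stable = stable and stay >= switch
        if stable:
            equilibria.append(k)
    return equilibria


if __name__ == "__main__":
    print(separation_equilibria(q=10, p=0.7, d=0.05))
```

In this toy specification, whenever `p` exceeds one half, every split that survives the check has at least half of the agents on action 1, which is the pattern the proposition describes.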

When observation is endogenous, a natural conjecture is that negative externalities
discourage observation, and this possibility is confirmed by the model. As the community
size increases, the marginal benefit from observation decreases because the equilibrium
actions are always split in certain proportions between 0 and 1. Hence, although truth-telling
observation still occurs if infinite observations can be made at the limit, the range
of signals under which observation is non-empty is narrowed by separation motives. Moreover,
a “tragedy of the commons” argument implies that more precise knowledge about the
true state may actually decrease the total payoff in a community, and hence, the issue
of discrepancy between equilibrium and efficiency arises, but I will not discuss this issue
further in this paper.

7 Conclusion 

In this paper, I studied the problem of Bayesian learning with coordination motives in
various signal and observation structures. A large and growing body of literature on social
learning focuses on whether equilibria lead to efficient information aggregation, but
most studies assume exogenous observation and no coordination motives. In many relevant
situations, these two assumptions are overly simple. Individuals sometimes obtain
their information not by some exogenous stochastic process but as a result of strategic
choices. In addition, their payoffs may be directly affected by the actions of other individuals.
This raises the questions of how different combinations of factors influence learning
differently, under what circumstances asymptotic learning can be achieved, and how the
results compare with benchmark cases studied in the literature.

To address these questions, I formulated a sequential-move learning model that incorporates
all these elements. The basic decision sequence of the model follows the
convention of Bikhchandani, Hirshleifer and Welch [7], Smith and Sorensen [33] and Acemoglu
et al. [1]: on a discrete time line, a signal about the underlying binary state is
realized at the beginning of every period and is observed by each agent in that period
only. Each agent takes a binary action at the end of her period, and meanwhile, she can
observe some of her predecessors’ actions that are potentially informative. Nevertheless,
my model differs from that used in most research in two fundamental aspects. First, in
the literature, there is usually only one agent in each period, whereas in my model, there
is a community consisting of multiple agents. Within a community, agents share their
information (reflected by the signal) and observation and take their actions simultaneously.
Also in contrast to the literature, in which each agent’s sole objective is to match
her action with the state, an agent’s payoff from a given action is determined by both
the state and the number of others in her community that take the same action. Second,
observation is assumed to be exogenously given in much of the literature, whereas in this
paper, I also analyze the case in which each agent can pay a cost to strategically choose
a subset of her predecessors to observe.

I characterized pure-strategy (perfect Bayesian) equilibria for each observation structure
(exogenous and endogenous) and characterized the conditions under which asymptotic
learning can be obtained or approximated. When observation is exogenous, asymptotic
learning occurs if private beliefs are unbounded and observations are “expanding”,
i.e., every observed neighborhood contains the action of some close predecessor over time.
This result holds regardless of the community size. If private beliefs are bounded, for most
common observation schemes, the probability of learning is bounded away from 1 when
the community size is small, but it can become arbitrarily close to asymptotic learning
when the community size is larger than a certain threshold. Coordination motives reduce
herding and improve social learning in this case.

When observation is endogenous, coordination motives also help to achieve better
social learning but in a very different way. With a small community size, asymptotic
learning never occurs because agents do not always observe: when the private signal is
strong, it is not worthwhile to pay the observation cost for a small marginal expected
benefit. However, when the community size becomes large, coordination motives encourage
observation even when the private signal is strong because the marginal benefit from
observation increases with the number of agents in a community. Therefore, asymptotic
learning (or nearly asymptotic learning) occurs even when observation is costly.

I also discussed the issue of equilibrium selection and proposed risk dominance as a
selection criterion for the action profile after both signal and observation are realized. In
the selected equilibria, coordination motives do not affect learning at all when observation
is exogenous, have a positive effect on learning when observation is endogenous and
private beliefs are unbounded, and may either positively or negatively influence learning
when observation is endogenous and private beliefs are bounded.

Beyond the specific results presented in this paper, I believe that the framework developed
here can be applied to analyze learning dynamics in a more general and complex
environment. The following questions are among those that can be studied in future
work using this framework: (1) equilibrium learning when agents’ preferences are heterogeneous,
both over time and within a community; (2) the effect of coordination motives
when agents in the same community make sequential decisions; and (3) equilibrium learning
when the strength of coordination motives depends on the true state.
APPENDIX


Proof of Proposition 1. Suppose that there exist some $\sigma^*$ and t such that in $N^t$,
$Q' \in (1, Q^t)$ agents choose action 1 and the others choose action 0 in equilibrium. Let
$P = \mathcal{P}_{\sigma^*}(\theta = 1 \mid I^t)$; for every agent that chooses action 1, we have

$$P u(1, 1, Q') + (1 - P) u(1, 0, Q') > P u(1, 0, Q^t - Q' + 1) + (1 - P) u(1, 1, Q^t - Q' + 1).$$

For every agent that chooses action 0, we have

$$P u(1, 0, Q^t - Q') + (1 - P) u(1, 1, Q^t - Q') > P u(1, 1, Q' + 1) + (1 - P) u(1, 0, Q' + 1).$$

Combining the inequalities yields

$$P u(1, 1, Q') + (1 - P) u(1, 0, Q') > P u(1, 1, Q' + 1) + (1 - P) u(1, 0, Q' + 1),$$

which is a contradiction. □

Proof of Theorem 1. I prove the result by establishing several lemmas. 

Lemma 1. There exists $\bar{Q}$ such that for any $Q^t \ge \bar{Q}$ and for any $I^t$, every action profile
with unanimous action constitutes mutual best responses in $N^t$.

Proof. Without loss of generality, assume that every agent in $N^t$ chooses action 1 given
$I^t$. Let P denote the probability that $\theta = 1$ given $I^t$. For each agent, her expected
payoff from action 1 is $P u(1, 1, Q^t) + (1 - P) u(1, 0, Q^t)$, while her payoff from action 0 is
$P u(1, 0, 1) + (1 - P) u(1, 1, 1)$. For any $P \in (0, 1)$, as long as $Q^t$ is such that $u(1, 0, Q^t) >
u(1, 1, 1)$, the agent’s expected payoff from action 1 is higher. Hence, the action profile
with unanimous action constitutes mutual best responses. □
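
The condition in the proof of Lemma 1 can be made concrete with a small sketch. It is illustrative only: the payoff function below, with parameters `b` and `k`, is a hypothetical example that is increasing in the number of agents taking the same action, and it is not the payoff function of the paper. The code finds the smallest community size at which $u(1, 0, Q) > u(1, 1, 1)$ and checks that, at that size, staying with a unanimous action is a best response for any posterior.

```python
# Illustrative sketch only: the payoff function and its parameters are
# hypothetical; m counts all agents (including oneself) on the same action.

def u(match: bool, m: int, b: float = 0.5, k: float = 0.5) -> float:
    """Payoff from an action that does or does not match the state."""
    if match:
        return 1.0 + b * (m - 1)
    return k * b * (m - 1)


def unanimity_threshold(b: float = 0.5, k: float = 0.5, q_max: int = 10_000):
    """Smallest size Q with u(mismatch, Q) > u(match, 1), or None if not found."""
    for q in range(1, q_max + 1):
        if u(False, q, b, k) > u(True, 1, b, k):
            return q
    return None


def unanimous_is_best_response(q: int, p: float, b: float = 0.5, k: float = 0.5) -> bool:
    """Check that no agent gains by leaving a unanimous action-1 profile.

    p is the posterior probability that the state is 1. A deviator ends up
    alone on action 0, which matches the state with probability 1 - p.
    """
    stay = p * u(True, q, b, k) + (1 - p) * u(False, q, b, k)
    deviate = p * u(False, 1, b, k) + (1 - p) * u(True, 1, b, k)
    return stay >= deviate


if __name__ == "__main__":
    q_bar = unanimity_threshold()
    print("threshold:", q_bar)
    print(all(unanimous_is_best_response(q_bar, p / 100) for p in range(1, 100)))
```

With these hypothetical parameters, a very small community with a posterior strongly against the unanimous action fails the check, while any community at or above the threshold passes it for every posterior.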

Given an equilibrium $\phi^*$, consider an arbitrary subset of the set of time periods with
infinite complete observations, which consists of sufficiently many consecutive communities
of at least size $\bar{Q}$ starting from some time period k. From the assumption that
$G(Q \ge \bar{Q}) > 0$, such a subset exists with probability 1. Let $B_k$ be the neighborhood
consisting of the first k communities, and consider any agent in a community of size Q
who has a private signal s and observes $B_k$. Let $R^{B_k}_{\phi^*}$ be the random variable of the
posterior belief about the true state being 1 given each decision in $B_k$. For each realized
belief $R^{B_k}_{\phi^*} = r$, we say that a realized private signal s and decision sequence h in $B_k$
induce r if $\mathcal{P}_{\phi^*}(\theta = 1 \mid h, s) = r$.
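Lemma 2. In any equilibrium $\phi^*$, for either state $\theta = 0, 1$ and for any $s \in S$, we have

$$\lim_{\epsilon \to 0^+} \Big( \limsup_{k \to \infty} \mathcal{P}_{\phi^*}(R^{B_k}_{\phi^*} > 1 - \epsilon \mid 0, s) \Big)
= \lim_{\epsilon \to 0^+} \Big( \limsup_{k \to \infty} \mathcal{P}_{\phi^*}(R^{B_k}_{\phi^*} < \epsilon \mid 1, s) \Big) = 0.$$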

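Proof. I prove here that $\lim_{\epsilon \to 0^+} ( \limsup_{k \to \infty} \mathcal{P}_{\phi^*}(R^{B_k}_{\phi^*} > 1 - \epsilon \mid 0, s) ) = 0$, and the second
equality follows from an analogous argument. Suppose that the equality does not
hold; then $s \in S$ and $\rho > 0$ exist such that for any $\epsilon > 0$ and any $N \in \mathbb{N}$, $k > N$ exists
such that $\mathcal{P}_{\phi^*}(R^{B_k}_{\phi^*} > 1 - \epsilon \mid 0, s) > \rho$. Consider any realized action sequence $h_\epsilon$ from
$B_k$ that, together with s, induces some $r > 1 - \epsilon$, and let $H_\epsilon$ denote the set of all such action
sequences; thus, we know that

$$\frac{\mathcal{P}_{\phi^*}(h_\epsilon \mid 1) f^Q_1(s)}{\mathcal{P}_{\phi^*}(h_\epsilon \mid 1) f^Q_1(s) + \mathcal{P}_{\phi^*}(h_\epsilon \mid 0) f^Q_0(s)} > 1 - \epsilon,
\qquad \sum_{h_\epsilon \in H_\epsilon} \mathcal{P}_{\phi^*}(h_\epsilon \mid 0) > \rho.$$

The above two conditions imply that

$$1 \ge \sum_{h_\epsilon \in H_\epsilon} \mathcal{P}_{\phi^*}(h_\epsilon \mid 1) > \frac{(1 - \epsilon) \rho f^Q_0(s)}{\epsilon f^Q_1(s)}.$$

For sufficiently small $\epsilon$, we have $\frac{(1 - \epsilon) \rho f^Q_0(s)}{\epsilon f^Q_1(s)} > 1$, which is a contradiction. □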


Lemma 3. There exists an equilibrium $\phi^*$ such that given any realized belief $r \in (0, 1)$
about state 1 for an agent observing $B_k$, for any $\hat{r} \in (0, r)$ ($\hat{r} \in (r, 1)$), $N(r, \hat{r}) \in \mathbb{N}$ exists
such that a realized belief that is less than $\hat{r}$ (higher than $\hat{r}$) can be induced by having
complete observations of additional $N(r, \hat{r})$ communities, each with unanimous action 0
(1).


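Proof. Without loss of generality, assume that $\hat{r} \in (0, r)$. We know that there is a
private signal s and an action sequence h from $B_k$ such that

$$r = \frac{\mathcal{P}_{\phi^*}(h \mid 1) f^Q_1(s)}{\mathcal{P}_{\phi^*}(h \mid 1) f^Q_1(s) + \mathcal{P}_{\phi^*}(h \mid 0) f^Q_0(s)}.$$

Let $a^{k+1} = 0$ denote the event that action 0 is taken by every agent in the (k + 1)th
community. The new belief would then be

$$r_1 = \frac{\mathcal{P}_{\phi^*}(h \mid 1) f^Q_1(s) \times \mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 1)}{\mathcal{P}_{\phi^*}(h \mid 1) f^Q_1(s) \times \mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 1) + \mathcal{P}_{\phi^*}(h \mid 0) f^Q_0(s) \times \mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 0)}.$$

I now explicitly describe an equilibrium $\phi^*$ that will prove this result. Consider the
following strategy profile for agents in an arbitrary community $N^t$: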
1. If $Q^t < \bar{Q}$, each agent takes the action that matches the state with higher
probability according to signal and observation.

2. If $Q^t \ge \bar{Q}$:

- Fix some $\epsilon > 0$. Let $s_1(\epsilon), s_0(\epsilon)$ be such that $F^Q_1(s_1(\epsilon)) - F^Q_1(s_0(\epsilon)) =
F^Q_0(s_1(\epsilon)) - F^Q_0(s_0(\epsilon)) = 1 - 2\epsilon$. Each agent takes action 1 if $s^t > s_1(\epsilon)$
and action 0 if $s^t < s_0(\epsilon)$.

- Otherwise, each agent takes the action that matches the state with higher
probability according to observation only.

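By Lemma 1, when $\bar{Q}$ is sufficiently large, both (1) and (2) constitute mutual best
responses given $I^t$. Hence, the above strategy profile is an equilibrium. We then have

$$\mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 1) = F^Q_1(s_0(\epsilon)) + (F^Q_1(s_1(\epsilon)) - F^Q_1(s_0(\epsilon)))\,
\mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid s^{k+1} \in (s_0(\epsilon), s_1(\epsilon)), h, 1),$$

$$\mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 0) = F^Q_0(s_0(\epsilon)) + (F^Q_0(s_1(\epsilon)) - F^Q_0(s_0(\epsilon)))\,
\mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid s^{k+1} \in (s_0(\epsilon), s_1(\epsilon)), h, 0).$$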


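By construction of $\phi^*$, $F^Q_1(s_0(\epsilon)) < F^Q_0(s_0(\epsilon))$. Based on the assumption of complete
observations, h consists of every action that agent k + 1 may have observed, which implies
that $\mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid s^{k+1} \in (s_0(\epsilon), s_1(\epsilon)), h, 1) = \mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid s^{k+1} \in (s_0(\epsilon), s_1(\epsilon)), h, 0)$.
Hence, we know that the ratio $\mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 1) / \mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 0)$ has a < 1 upper bound that is independent
of h. Let y denote this bound, and we have

$$\frac{r_1}{r} = \frac{1 + \dfrac{\mathcal{P}_{\phi^*}(h \mid 0) f^Q_0(s)}{\mathcal{P}_{\phi^*}(h \mid 1) f^Q_1(s)}}{1 + \dfrac{\mathcal{P}_{\phi^*}(h \mid 0) f^Q_0(s)}{\mathcal{P}_{\phi^*}(h \mid 1) f^Q_1(s)} \cdot \dfrac{\mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 0)}{\mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 1)}}
\le \frac{1}{r + (1 - r) \frac{1}{y}} < 1.$$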


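Note that the expression on the right-hand side above is increasing in r. Let $r_m$ denote
the belief induced by $h \cup \{a^{k+1}, \dots, a^{k+m}\}$, where $a^{k+1} = \dots = a^{k+m} = 0$. We have

$$r_m = r \times \frac{r_1}{r} \times \frac{r_2}{r_1} \times \dots \times \frac{r_m}{r_{m-1}}
\le r \times \frac{1}{r + (1 - r) \frac{1}{y}} \times \dots \times \frac{1}{r_{m-1} + (1 - r_{m-1}) \frac{1}{y}}.$$

Because $r, r_1, \dots, r_m$ are decreasing, we have $r_m < r \times \Big( \frac{1}{r + (1 - r) \frac{1}{y}} \Big)^m$. Hence, we can
find the desired $N(r, \hat{r})$ for any $\hat{r} \in (0, r)$ such that a realized belief that is less than $\hat{r}$
can be induced by s and $h \cup \{a^{k+1}, \dots, a^{k+N(r, \hat{r})}\}$, where $a^{k+1} = \dots = a^{k+N(r, \hat{r})} = 0$. □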
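
The counting argument at the end of this proof can be illustrated numerically. The sketch below is not part of the proof. It assumes, for simplicity, that each additional community with unanimous action 0 multiplies the odds in favor of state 1 by exactly the bound y, whereas the proof only requires a factor of at most y. Under that assumption the returned number is an upper bound on the number of communities needed. The function name and the numerical values are hypothetical.

```python
# Illustrative sketch only: assumes every unanimous action-0 community
# multiplies the odds on state 1 by exactly y (the proof uses "at most y").

def communities_needed(r: float, r_target: float, y: float) -> int:
    """Number of unanimous action-0 communities that push the belief below r_target.

    r: current belief that the state is 1; r_target: belief to fall below;
    y: per-community bound (strictly between 0 and 1) on the likelihood ratio.
    """
    if not (0.0 < r_target < r < 1.0):
        raise ValueError("need 0 < r_target < r < 1")
    if not (0.0 < y < 1.0):
        raise ValueError("need 0 < y < 1")
    odds = r / (1.0 - r)
    target_odds = r_target / (1.0 - r_target)
    count = 0
    # Keep adding communities until the belief is strictly below the target.
    while odds >= target_odds:
        odds *= y
        count += 1
    return count


if __name__ == "__main__":
    print(communities_needed(r=0.9, r_target=0.1, y=0.5))
```

The loop terminates for any admissible inputs because the odds shrink geometrically, which mirrors the role of the bound y in the proof.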
Lemma 4. Consider the $\phi^*$ constructed above. Let $\hat{a}^k$ be the action that matches the state
with higher probability given s and every action in $B_k$, and let $\mathcal{P}_{\phi^*}(\hat{a}^k \neq \theta \mid s)$ denote the
probability that $\hat{a}^k$ does not match the state. We have $\lim_{k \to \infty} \mathcal{P}_{\phi^*}(\hat{a}^k \neq \theta \mid s) = 0$.


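Proof. Suppose the opposite: noting that $\mathcal{P}_{\phi^*}(\hat{a}^k \neq \theta \mid s)$ must be weakly decreasing in
k, it follows that $\lim_{k \to \infty} \mathcal{P}_{\phi^*}(\hat{a}^k \neq \theta \mid s) > 0$. Let $\rho > 0$ denote this limit. From Lemma
2, we know that for any $\alpha > 0$ and for either true state $\theta = 0, 1$, $z \in [\frac{1}{2}, 1)$ exists such
that $M \in \mathbb{N}$ exists such that $\max\{\mathcal{P}_{\phi^*}(R^{B_k}_{\phi^*} > z \mid 0, s), \mathcal{P}_{\phi^*}(1 - R^{B_k}_{\phi^*} > z \mid 1, s)\} < \alpha$ for any
$k > M$. If $\alpha = \frac{1}{2}\rho$, then we have $\max\{\mathcal{P}_{\phi^*}(R^{B_k}_{\phi^*} > z \mid 0, s), \mathcal{P}_{\phi^*}(1 - R^{B_k}_{\phi^*} > z \mid 1, s)\} < \frac{1}{2}\rho$
for any $k > M$. Then, for any $\delta > 0$, we can find a sufficiently large k such that for any
$k' > k$, (1) $\mathcal{P}_{\phi^*}(\hat{a}^{k'} \neq \theta \mid s) \in (\rho, \rho + \delta)$ and (2) $\max\{\mathcal{P}_{\phi^*}(R^{B_{k'}}_{\phi^*} > z \mid 0, s), \mathcal{P}_{\phi^*}(1 - R^{B_{k'}}_{\phi^*} > z \mid 1, s)\} < \frac{1}{2}\rho$.
Hence, we have

$$\frac{f^Q_0(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(R^{B_{k'}}_{\phi^*} \in [\tfrac{1}{2}, z] \mid 0, s) + \frac{f^Q_1(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(1 - R^{B_{k'}}_{\phi^*} \in [\tfrac{1}{2}, z] \mid 1, s)$$

$$= \mathcal{P}_{\phi^*}(\hat{a}^{k'} \neq \theta \mid s) - \frac{f^Q_0(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(R^{B_{k'}}_{\phi^*} > z \mid 0, s) - \frac{f^Q_1(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(1 - R^{B_{k'}}_{\phi^*} > z \mid 1, s) > \frac{1}{2}\rho.$$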

By Lemma 3, for any $\pi > 0$, $N(\pi) = \max\{N(z, \frac{1}{2 + \pi}), N(1 - z, 1 - \frac{1}{2 + \pi})\} \in \mathbb{N}$ exists
such that whenever $\theta = 0$ and $R^{B_k}_{\phi^*} \in [\frac{1}{2}, z]$ or $\theta = 1$ and $1 - R^{B_k}_{\phi^*} \in [\frac{1}{2}, z]$, additional
$N(\pi)$ observations can reverse an incorrect decision. Consider the following (sub-optimal)
updating method for a rational agent who observes $B_{k'} = B_{k + N(\pi)}$: her action changes
from 1 to 0 if and only if $R^{B_k}_{\phi^*} \in [\frac{1}{2}, z]$ and $a^{k+1} = \dots = a^{k + N(\pi)} = 0$; her action changes
from 0 to 1 if and only if $1 - R^{B_k}_{\phi^*} \in [\frac{1}{2}, z]$ and $a^{k+1} = \dots = a^{k + N(\pi)} = 1$. Let h denote
a decision sequence from $B_k$ that, together with s, induces such a posterior belief in the
former case, and let h' denote a decision sequence from $B_k$ that, together with s, induces
such a posterior belief in the latter case. Let H and H' respectively denote the sets of
these decision sequences. We have

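$$\mathcal{P}_{\phi^*}(\hat{a}^k \neq \theta \mid s) - \mathcal{P}_{\phi^*}(\hat{a}^{k'} \neq \theta \mid s)$$

$$\ge \sum_{h \in H} \Big( \frac{f^Q_0(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 0)
- \frac{f^Q_1(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 1) \Big)$$

$$+ \sum_{h' \in H'} \Big( \frac{f^Q_1(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(h', a^{k+1} = \dots = a^{k + N(\pi)} = 1 \mid 1)
- \frac{f^Q_0(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(h', a^{k+1} = \dots = a^{k + N(\pi)} = 1 \mid 0) \Big).$$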

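From the proof of Lemma 3, we know that for every h,

$$\frac{\mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 0) f^Q_0(s)}{\mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 0) f^Q_0(s) + \mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 1) f^Q_1(s)} > \frac{1 + \pi}{2 + \pi},$$

which implies that

$$\mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 0) f^Q_0(s) - \mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 1) f^Q_1(s)
> \pi f^Q_1(s) \mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 1).$$

From the proof of Lemma 3, we know that the quantities $\mathcal{P}_{\phi^*}(a^{k+1} = 0 \mid h, 1)$ and $\mathcal{P}_{\phi^*}(a^{k+1} =
1 \mid h, 0)$ have a > 0 lower bound that is independent of h. We denote this bound as w, and
the above inequality can be written as

$$\mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 0) f^Q_0(s) - \mathcal{P}_{\phi^*}(h, a^{k+1} = \dots = a^{k + N(\pi)} = 0 \mid 1) f^Q_1(s)
> \pi f^Q_1(s) w^{N(\pi)} \mathcal{P}_{\phi^*}(h \mid 1).$$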


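Based on the definition of h, we have

$$\frac{1}{2} \le \frac{\mathcal{P}_{\phi^*}(h \mid 1) f^Q_1(s)}{\mathcal{P}_{\phi^*}(h \mid 1) f^Q_1(s) + \mathcal{P}_{\phi^*}(h \mid 0) f^Q_0(s)} \le z,$$

which implies that

$$\mathcal{P}_{\phi^*}(h \mid 1) f^Q_1(s) \ge \mathcal{P}_{\phi^*}(h \mid 0) f^Q_0(s).$$

Similarly, we have

$$\mathcal{P}_{\phi^*}(h', a^{k+1} = \dots = a^{k + N(\pi)} = 1 \mid 1) f^Q_1(s) - \mathcal{P}_{\phi^*}(h', a^{k+1} = \dots = a^{k + N(\pi)} = 1 \mid 0) f^Q_0(s)
> \pi f^Q_0(s) w^{N(\pi)} \mathcal{P}_{\phi^*}(h' \mid 0),$$

and

$$\mathcal{P}_{\phi^*}(h' \mid 0) f^Q_0(s) \ge \mathcal{P}_{\phi^*}(h' \mid 1) f^Q_1(s).$$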


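From the previous construction, we know that

$$\frac{f^Q_0(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(R^{B_k}_{\phi^*} \in [\tfrac{1}{2}, z] \mid 0, s) + \frac{f^Q_1(s)}{f^Q_0(s) + f^Q_1(s)} \mathcal{P}_{\phi^*}(1 - R^{B_k}_{\phi^*} \in [\tfrac{1}{2}, z] \mid 1, s) > \frac{1}{2}\rho.$$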


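Combining the previous inequalities, we obtain a strictly positive lower bound, independent of $\delta$, on

$$\mathcal{P}_{\phi^*}(\hat{a}^k \neq \theta \mid s) - \mathcal{P}_{\phi^*}(\hat{a}^{k'} \neq \theta \mid s).$$

From the previous construction, we also know that

$$\mathcal{P}_{\phi^*}(\hat{a}^k \neq \theta \mid s) - \mathcal{P}_{\phi^*}(\hat{a}^{k'} \neq \theta \mid s) < \delta.$$

Clearly, for some given $\pi > 0$, a sufficiently small $\delta$ exists such that this lower bound exceeds $\delta$, which
implies a contradiction. □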


Lemma 4 implies that in the equilibrium $\phi^*$ constructed in Lemma 3, truth-telling
observation occurs. We can then compute the probability of taking the state-matching
action: $\lim_{t \to \infty} \mathcal{P}_{\phi^*}(a^t = \theta) = 1 - \frac{1}{2} F^Q_1(s_0(\epsilon)) - \frac{1}{2}(1 - F^Q_0(s_1(\epsilon)))$. From the characterization
of $s_0(\epsilon)$ and $s_1(\epsilon)$, we know that $F^Q_1(s_0(\epsilon)) + 1 - F^Q_0(s_1(\epsilon)) < F^Q_0(s_0(\epsilon)) + 1 - F^Q_0(s_1(\epsilon)) =
2\epsilon$, which implies that $\lim_{t \to \infty} \mathcal{P}_{\phi^*}(a^t = \theta) > 1 - \epsilon$. □

Proof of Proposition 2. Consider an infinite subset of communities $\{N^{t_1}, N^{t_2}, \dots\}$
such that $N^{t_{n+1}}$’s observation includes $N^{t_n}$. By the assumption of expanding observations,
we know that each community belongs to at least one such subset. Then, it
suffices to show that asymptotic learning occurs in each of these subsets.

Consider the following equilibrium $\phi^*$: for every $I^t$, every agent in $N^t$ takes the action
that matches the state with higher probability according to both signal and observation.
Because actions are unanimous in equilibrium, I use $a^t$ to denote the equilibrium action
in community $N^t$. Suppose that the result does not hold, i.e., there exists $\{N^{t_1}, N^{t_2}, \dots\}$
such that $\lim_{n \to \infty} \mathcal{P}_{\phi^*}(a^{t_n} = \theta) = \bar{P} < 1$.
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For an arbitrary e > 0, find n such that = ^) ^ {P ~ P)- follows that 

there must be some community size Q' in support of G, such that either Vcf,*{a^" = 1\6 = 
1, = Q') < P or V^* (a*" = O|0 = 0, Q*" = Q') < P- Without loss of generality, assume 

that the second inequality holds. Based on the assumption of unbounded private beliefs, 

f^(s) (s) 

there exists Q such that lim^-j.! o/! Jo, ^ = 1 and hms^_i o/l o, . = 0. Consider 
^ /o^(d+/?h) f?{s)+f?{s) 

community with the realized community size Q and a sufficiently small private 

signal s*"+b If the agents in take their action according to signals only, each 

receives the expected payoff 


1, Q) + 


-M(i,o,g). 


/Q(stn+i) +/iQ(5Wi) ’ ’ 

If they simply follow the action in community iV*", each receives the expected payoff 

.Q, t = 0|« = 0'0‘" = 

+ = 1|0 = O,Q*" = Q')«(l,0,g)) 


= l\e = GQ^- = Q')u{lXQ) 


/Q(5in+l) + /®(s*" + l 

+ =O|0 = 1,Q*" =Q')«(l,0,g)) 


The difference between the two payoffs is bounded below by 

yQ(gWi) /i®(s‘-+i) 


-(1-F)- 


)(u(i,i,g) - M(i,o,g)). 


/Q(5in+l) + /«(5tn + l) fo{s^^+^) + 

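Because private beliefs are unbounded, there exists $s' > -1$ such that this difference is
positive when $s^{t_{n+1}} \in (-1, s')$. It follows that there exists $\Delta > 0$ such that
$\mathcal{P}_{\phi^*}(a^{t_{n+1}} = \theta \mid Q^{t_{n+1}} = Q, Q^{t_n} = Q') > \mathcal{P}_{\phi^*}(a^{t_n} = \theta \mid Q^{t_n} = Q') + \Delta$. Hence, we have
$\mathcal{P}_{\phi^*}(a^{t_{n+1}} = \theta) > \mathcal{P}_{\phi^*}(a^{t_n} = \theta) + \Delta G(Q') G(Q) > P - \epsilon + \Delta G(Q') G(Q)$.
Because $\epsilon$ can be taken arbitrarily, when $\epsilon$ is sufficiently small, we have
$\mathcal{P}_{\phi^*}(a^{t_{n+1}} = \theta) > P$, which is a contradiction. □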
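
The existence of the threshold $s'$ can be illustrated numerically. The sketch below is only an example under assumptions that are not taken from the paper: the signal densities are $f_0(s) = (1-s)/2$ and $f_1(s) = (1+s)/2$ on $(-1, 1)$ (one simple case of unbounded private beliefs), priors are equal, and the value of $P$ is made up. The factor multiplying $u(1,1,Q) - u(1,0,Q)$ in the lower bound is $(1-P)\frac{f_0}{f_0+f_1} - \frac{f_1}{f_0+f_1}$, which under these densities is positive exactly when $s < -P/(2-P)$.

```python
# Illustrative check of the threshold s' in the proof of Proposition 2.
# ASSUMPTION (not from the paper): f0(s) = (1 - s)/2, f1(s) = (1 + s)/2 on (-1, 1).

def posterior_state0(s):
    """P(theta = 0 | s) under equal priors and the assumed densities."""
    f0, f1 = (1 - s) / 2, (1 + s) / 2
    return f0 / (f0 + f1)

def bound_factor(s, p_bar):
    """Factor multiplying u(1,1,Q) - u(1,0,Q) in the lower bound on the payoff difference."""
    p0 = posterior_state0(s)
    return (1 - p_bar) * p0 - (1 - p0)

def threshold(p_bar):
    """Closed-form s' for the assumed densities: the factor is positive iff s < s'."""
    return -p_bar / (2 - p_bar)

p_bar = 0.9  # hypothetical limit probability P < 1
s_prime = threshold(p_bar)
for s in (-0.99, s_prime - 0.01, s_prime + 0.01, 0.0):
    print(f"s = {s:+.3f}  factor = {bound_factor(s, p_bar):+.4f}")
```

For any $P < 1$ the threshold $-P/(2-P)$ lies strictly above $-1$, so an interval $(-1, s')$ of signals with a positive bound always exists in this example.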


Proof of Theorem 3. Consider the following strategy profile $\sigma$ for agents in an
arbitrary community $N^t$:


1. Given any $s^t$, agent 1 observes $a_1^{t-1}$ and no other agent makes any observation.

2. If $h^t = \{a_1^{t-1}\}$, then each agent in $N^t$ takes the action that matches the state
with higher probability according to $I^t$. Otherwise, each agent takes the opposite
action (the action that matches the state with lower probability).

By Lemma 1, the action profile given $I^t$ specified above constitutes mutual best
responses when $Q^t$ is sufficiently large. If $h^t \neq \{a_1^{t-1}\}$, the payoff before cost for each
agent in $N^t$ is bounded above by $\frac{1}{2}\big(u(1,1,Q^t) + u(1,0,Q^t)\big)$; if $h^t = \{a_1^{t-1}\}$, the payoff
before cost is bounded below by

$$\frac{1}{2}\Big( \big(F_0^{Q^t}(\bar{s}) + (1 - F_1^{Q^t}(\bar{s}))\big)\, u(1,1,Q^t) + \big(F_1^{Q^t}(\bar{s}) + (1 - F_0^{Q^t}(\bar{s}))\big)\, u(1,0,Q^t) \Big),$$

where $\bar{s}$ is characterized by $f_0^{Q^t}(\bar{s}) = f_1^{Q^t}(\bar{s})$. When $c$ is small, the difference between
the two payoffs exceeds $c$ for sufficiently large $Q^t$. Hence, it is optimal to follow the
observation decision in (1) above given that every other agent follows $\sigma$, which means
that $\sigma$ is an equilibrium.

Note that in $\sigma$, starting from $t = 2$, agents in $N^t$ always observe regardless of $s^t$.
Thus, we can apply Proposition 2 to obtain asymptotic learning in $\sigma$. □
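
The gap between the two payoff bounds in this proof can be written compactly: subtracting the upper bound $\frac{1}{2}(u(1,1,Q^t) + u(1,0,Q^t))$ from the lower bound leaves $\frac{1}{2}\big(F_0(\bar{s}) - F_1(\bar{s})\big)\big(u(1,1,Q^t) - u(1,0,Q^t)\big)$, where $\bar{s}$ is the signal at which the two densities cross. The sketch below checks this numerically. Everything concrete in it is an assumption for illustration only: the linear densities, the payoff numbers and the cost are not from the paper.

```python
# ASSUMPTIONS (illustrative, not from the paper):
#   f0(s) = (1 - s)/2 and f1(s) = (1 + s)/2 on (-1, 1), so the densities cross at s_bar = 0;
#   u11 = u(1,1,Q), u10 = u(1,0,Q) and the cost are made-up numbers.

def F0(s):
    """CDF of the signal in state 0 for the assumed density."""
    return (3 + 2 * s - s * s) / 4

def F1(s):
    """CDF of the signal in state 1 for the assumed density."""
    return (1 + s) ** 2 / 4

def no_observation_bound(u11, u10):
    """Upper bound on the payoff when the prescribed observation is not made."""
    return 0.5 * (u11 + u10)

def observation_bound(u11, u10, s_bar=0.0):
    """Lower bound on the payoff before cost when the prescribed observation is made."""
    right = F0(s_bar) + (1 - F1(s_bar))   # weight on the matching payoff
    wrong = F1(s_bar) + (1 - F0(s_bar))   # weight on the mismatching payoff
    return 0.5 * (right * u11 + wrong * u10)

u11, u10, cost = 3.0, 1.0, 0.2  # hypothetical values
gap = observation_bound(u11, u10) - no_observation_bound(u11, u10)
closed_form = 0.5 * (F0(0.0) - F1(0.0)) * (u11 - u10)
print(gap, closed_form, gap > cost)
```

In this example the gap is positive whenever $u(1,1,Q^t) > u(1,0,Q^t)$, so a small enough cost $c$ is always covered.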


Proof of Theorem 4. Consider the following strategy profile $\sigma$ for agents in an
arbitrary community $N^t$:


1. Given any $s^t$, agent 1 observes the neighborhood $B^t$ of size $K(t)$ that maximizes
$\mathcal{P}_\sigma(a^t = \theta \mid B^t)$, and no other agent makes any observation.

2. If $h^t = \{a_m : m \in B^t\}$: Fix some $\epsilon > 0$. Let $s_1(\epsilon), s_0(\epsilon)$ be such that
$F_1^{Q^t}(s_1(\epsilon)) - F_1^{Q^t}(s_0(\epsilon)) = F_0^{Q^t}(s_1(\epsilon)) - F_0^{Q^t}(s_0(\epsilon)) = 1 - 2\epsilon$. An agent takes action 1 if
$s^t > s_1(\epsilon)$ and action 0 if $s^t < s_0(\epsilon)$. Otherwise, an agent takes the action that matches
the state with higher probability according to observation only.

3. If $h^t \neq \{a_m : m \in B^t\}$, each agent takes the action that matches the state with
lower probability according to $I^t$.


From the proofs of Lemma 1 and Theorem 3, $\sigma$ is an equilibrium. We can then apply
Theorem 1 to prove the result. □

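Proof of Proposition 3. Consider any community $N^t$ of size $Q$, any $C \subseteq N^t$ of size $Q'$
and any $I^t$. Let $a^t(I^t)$ be the truth-seeking action profile and $\tilde{a}^t(I^t)$ be an arbitrary
action profile with unanimous action. Without loss of generality, assume that $a^t(I^t) = 1$
and $\tilde{a}^t(I^t) = 0$. Let $P$ denote the probability that $\theta = 1$ given $I^t$. The definition of the
truth-seeking action profile implies that $P > \frac{1}{2}$. Then, we have

$$\begin{aligned}
&\Pi_i\big(a^t(I^t), I^t\big) - \Pi_i\big((\{\tilde{a}_j^t(I^t)\}_{j \in C}, \{a_j^t(I^t)\}_{j \notin C}), I^t\big) \\
&\quad = P u(1,1,Q) + (1-P) u(1,0,Q) - \big(P u(1,0,Q') + (1-P) u(1,1,Q')\big), \\
&\Pi_i\big(\tilde{a}^t(I^t), I^t\big) - \Pi_i\big((\{a_j^t(I^t)\}_{j \in C}, \{\tilde{a}_j^t(I^t)\}_{j \notin C}), I^t\big) \\
&\quad = P u(1,0,Q) + (1-P) u(1,1,Q) - \big(P u(1,1,Q') + (1-P) u(1,0,Q')\big).
\end{aligned}$$

It follows that

$$\begin{aligned}
&\Pi_i\big(a^t(I^t), I^t\big) - \Pi_i\big((\{\tilde{a}_j^t(I^t)\}_{j \in C}, \{a_j^t(I^t)\}_{j \notin C}), I^t\big) \\
&\quad - \Big(\Pi_i\big(\tilde{a}^t(I^t), I^t\big) - \Pi_i\big((\{a_j^t(I^t)\}_{j \in C}, \{\tilde{a}_j^t(I^t)\}_{j \notin C}), I^t\big)\Big) \\
&= (2P - 1)\big(u(1,1,Q) - u(1,0,Q')\big) + (1 - 2P)\big(u(1,0,Q) - u(1,1,Q')\big) \\
&= (2P - 1)\big(u(1,1,Q) - u(1,0,Q') - (u(1,0,Q) - u(1,1,Q'))\big) \\
&= (2P - 1)\big(u(1,1,Q) - u(1,0,Q) + u(1,1,Q') - u(1,0,Q')\big) > 0.
\end{aligned}$$

Hence, the inequality is proven. □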
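
The algebra behind the final factorisation can be checked mechanically. The following sketch uses exact rational arithmetic on randomly drawn inputs. The payoff values are hypothetical and only respect $u(1,1,q) > u(1,0,q)$ and $P > \frac{1}{2}$; the function and variable names are illustrative and not taken from the paper.

```python
from fractions import Fraction
import random

def deviation_gap(P, u11_Q, u10_Q, u11_Qp, u10_Qp):
    """Difference of the two deviation losses, computed term by term."""
    truth_loss = P * u11_Q + (1 - P) * u10_Q - (P * u10_Qp + (1 - P) * u11_Qp)
    other_loss = P * u10_Q + (1 - P) * u11_Q - (P * u11_Qp + (1 - P) * u10_Qp)
    return truth_loss - other_loss

def factored_form(P, u11_Q, u10_Q, u11_Qp, u10_Qp):
    """The factored expression (2P - 1)(u11_Q - u10_Q + u11_Qp - u10_Qp)."""
    return (2 * P - 1) * (u11_Q - u10_Q + u11_Qp - u10_Qp)

def random_case(rng):
    """Draw P > 1/2 and payoffs with u(1,1,q) > u(1,0,q); all values are hypothetical."""
    P = Fraction(rng.randint(51, 99), 100)
    u10_Q = Fraction(rng.randint(0, 50))
    u10_Qp = Fraction(rng.randint(0, 50))
    u11_Q = u10_Q + rng.randint(1, 20)
    u11_Qp = u10_Qp + rng.randint(1, 20)
    return P, u11_Q, u10_Q, u11_Qp, u10_Qp

rng = random.Random(0)
cases = [random_case(rng) for _ in range(1000)]
all_match = all(deviation_gap(*c) == factored_form(*c) for c in cases)
all_positive = all(deviation_gap(*c) > 0 for c in cases)
print(all_match, all_positive)
```

The sign of the gap depends only on $2P - 1$ and on the matching payoff exceeding the mismatching payoff at both community sizes, which is why the relative sizes of $Q$ and $Q'$ play no role.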

Proof of Proposition 4. From Theorem 2, we know that truth-telling observation occurs
in every $N^t$. Suppose that $Q^t = Q$. From the characterization of $s^*(Q)$, we know
that for any $s^t \in (-s^*(Q), s^*(Q))$, agents in $N^t$ prefer paying $c$ to know the true state
over paying nothing and acting according to $s^t$. It then follows that when $t$ is sufficiently
large, whenever $s^t \in (-s^*(Q), s^*(Q))$, the equilibrium observation in $N^t$ is non-empty;
otherwise, given the truth-seeking action profile, any agent can be better off by paying $c$ and
observing a neighborhood of size $K(t)$. Therefore, at the limit, an agent takes the correct
action if and only if her signal lies in $[-s^*(Q), s^*(Q)]$ and follows her signal otherwise.
The probability of her action matching the state, $\mathcal{P}(a^t = \theta \mid Q^t = Q)$, is then equal to
$F_0(s^*(Q))$. Finally, by taking expectation over the probability distribution $G$, we obtain
the result. □

Proof of Corollary 1. Note that $F_0(s^*(Q))$ is increasing in $Q$. The result follows by
applying the relevant property of first-order stochastic dominance.

□
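
The property of first-order stochastic dominance used here is the standard one: an increasing function has a weakly higher expectation under the dominating distribution. The sketch below illustrates it with two made-up community-size distributions and a made-up increasing function standing in for the learning probability. None of these objects come from the paper.

```python
# ASSUMPTIONS: G_low, G_high and learn_prob are hypothetical examples.

def cdf(dist, q):
    """P(Q <= q) for a distribution given as {size: probability}."""
    return sum(p for size, p in dist.items() if size <= q)

def dominates(high, low):
    """True if `high` first-order stochastically dominates `low`."""
    support = sorted(set(high) | set(low))
    return all(cdf(high, q) <= cdf(low, q) + 1e-12 for q in support)

def expectation(dist, h):
    """E[h(Q)] under the distribution."""
    return sum(p * h(q) for q, p in dist.items())

def learn_prob(q):
    """An arbitrary increasing function of community size."""
    return 1 - 0.5 / q

G_low = {1: 0.5, 2: 0.3, 5: 0.2}
G_high = {1: 0.2, 2: 0.3, 5: 0.5}

print(dominates(G_high, G_low))
print(expectation(G_low, learn_prob), expectation(G_high, learn_prob))
```

Shifting probability mass toward larger communities raises the expected value of any increasing function of $Q$, which is the step the corollary relies on.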
Proof of Proposition 5. Suppose that there exists a symmetric equilibrium $\sigma^*$ with
asymptotic learning. Since agents in the same community are using identical strategies
by assumption, I use $s^t$ to denote the private signal of an arbitrary agent in $N^t$.

Without loss of generality, suppose that the true state is 1. Fix an arbitrary positive
number $\epsilon_1$. Since asymptotic learning occurs, we can find $\bar{t}$ such that: if the true state
is 0, for every $t > \bar{t}$ and every agent in $N^t$, the agent takes action 0 with a probability
of at least $1 - \epsilon_1$. Hence the probability that every agent in $N^t$ takes action 0 is at least
$(1 - \epsilon_1)^{Q^t}$. By asymptotic learning and continuity of the signal distributions, for every
positive number $\epsilon_2$ we can find a sufficiently small $\epsilon_1$ and the corresponding $\bar{t}$ such that
an arbitrary agent in $N^{\bar{t}+1}$ takes action 0 with a probability of at least $1 - \epsilon_2$ under either
state, given that each agent in $N^{\bar{t}}$ has taken action 0. Note that since $\bar{t}$ is finite, the
probability that each agent in $N^{\bar{t}}$ takes action 0 under the true state 1 is still positive
and bounded away from 0.

Now consider the optimal action of an agent in $N^{\bar{t}+1}$. Signals are independent, so with
probability of at least $(1 - \epsilon_2)^{Q^{\bar{t}+1} - 1}$ all the other agents in $N^{\bar{t}+1}$ will choose action 0. If
the agent chooses action 0, her payoff is bounded below by $(1 - \epsilon_2)^{Q^{\bar{t}+1} - 1}\, u(1, 0, Q^{\bar{t}+1})$;
if she chooses action 1, her payoff is bounded above by $\max\{(1 - \epsilon_2)^{Q^{\bar{t}+1} - 1}\, u(1,1,1) + [\ldots]\}$.
When $Q^{\bar{t}+1}$ is sufficiently large, we can find sufficiently small $\epsilon_2$ such that the first bound
is higher than the second bound. Hence, the agent will choose action 0 regardless of her
private signal. This argument then works for every $t' > \bar{t} + 1$, which means that herding
occurs. This is a contradiction to asymptotic learning. □

Proof of Proposition 6. In any equilibrium, the number of agents choosing action 1,
denoted $Q_1^t$, must satisfy

$$\begin{aligned}
P u(1,1,Q_1^t) + (1-P) u(1,0,Q_1^t) &\geq (1-P) u(1,1,Q^t - Q_1^t + 1) + P u(1,0,Q^t - Q_1^t + 1), \\
(1-P) u(1,1,Q^t - Q_1^t) + P u(1,0,Q^t - Q_1^t) &\geq P u(1,1,Q_1^t + 1) + (1-P) u(1,0,Q_1^t + 1).
\end{aligned}$$

From the second inequality, we have

$$u(1,1,Q^t - Q_1^t) - u(1,0,Q_1^t + 1) \geq \frac{P}{1-P}\big(u(1,1,Q_1^t + 1) - u(1,0,Q^t - Q_1^t)\big).$$

From the assumption that $\frac{P}{1-P} > 1$, we know that either (1) $u(1,1,Q^t - Q_1^t) - u(1,0,Q_1^t + 1)$
is positive and $u(1,1,Q^t - Q_1^t) - u(1,0,Q_1^t + 1) > u(1,1,Q_1^t + 1) - u(1,0,Q^t - Q_1^t)$, or (2)
both $u(1,1,Q^t - Q_1^t) - u(1,0,Q_1^t + 1)$ and $u(1,1,Q_1^t + 1) - u(1,0,Q^t - Q_1^t)$ are non-positive.

Assume that (1) holds. Note that $u(1,1,Q^t - Q_1^t) - u(1,1,Q_1^t + 1)$ and
$u(1,0,Q^t - Q_1^t) - u(1,0,Q_1^t + 1)$ have the same sign. Then, from
$u(1,1,Q^t - Q_1^t) - u(1,0,Q_1^t + 1) > u(1,1,Q_1^t + 1) - u(1,0,Q^t - Q_1^t)$, we have
$Q^t - Q_1^t < Q_1^t + 1$, which implies that $Q_1^t > \frac{Q^t - 1}{2}$.

Assume that (2) holds. Then, from the two non-positive expressions, it follows that
$Q^t - Q_1^t > Q_1^t + 1$ and $Q_1^t + 1 > Q^t - Q_1^t$, which is a contradiction. Hence, we can conclude
that the only possible case is (1), and thus, $Q_1^t > \frac{Q^t - 1}{2}$. □